%!TEX TS-program = xelatex
%!BIB TS-program = biber
%!TEX encoding = UTF-8 Unicode

\documentclass[12pt]{article}

\usepackage{array}

\usepackage{amsmath,amssymb,amsthm,mathtools}
\usepackage{fontspec,xltxtra}
\setromanfont{Times New Roman}

\usepackage{geometry}
    \geometry{letterpaper}

\usepackage{setspace}
\usepackage{fullpage}
\usepackage{multirow}

\usepackage{pdflscape}
\usepackage{booktabs}

\usepackage{graphicx}
\usepackage{caption}
\usepackage{subcaption}
	\captionsetup[figure]{labelfont={rm,bf}, textfont={rm,it}}
	\captionsetup[table]{labelfont={rm,bf}, textfont={rm,it}}
\usepackage{float}
\usepackage{enumitem}
\usepackage{authblk}
\usepackage{xcolor}

\usepackage[authordate, backend=biber]{biblatex-chicago}
    \addbibresource{chp_apsr_refs.bib}

\usepackage{tikz,pgfplots}
    \usetikzlibrary{calc,patterns,decorations.pathreplacing}
    \usepgfplotslibrary{fillbetween}
    \pgfplotsset{compat=1.15}
    \tikzstyle{every picture}+=[font=\small]
    \tikzset{>=latex}

\usepackage{hyperref}
\usepackage{url}

\newtheoremstyle{mystyle}
  {18pt} % Space above
  {6pt} % Space below
  {} % Body font
  {0in} % Indent amount
  {\bfseries} % Theorem head font
  {.} % Punctuation after theorem head
  {0.50em} % Space after theorem head
  {} % Theorem head spec (can be left empty, meaning `normal')

\usepackage[nameinlink,noabbrev]{cleveref}
	\theoremstyle{mystyle}
 	\newtheorem{lemma}{Lemma}
	\newtheorem{corollary}{Corollary}
	\newtheorem{defn}{Definition}
	\newtheorem{assumption}{Assumption}
	\newtheorem{example}{Example}
	\newtheorem{remark}{Remark}
	\newtheorem{definition}{Definition}

\usepackage{tocloft}
\usepackage[normalem]{ulem}

\usepackage{longtable}
\usepackage{placeins}


% Prevent the hyphenation in works cited page
\makeatletter
\AtEveryBibitem{\global\undef\bbx@lasthash}
\makeatother

\newcommand{\mytitle}{Trading Diversity? Judicial Diversity \\and Case Outcomes in Federal Courts}

\newcommand{\mydate}{May 30, 2024}

\newcommand{\mainauthor}{Ryan Copus\footnote{Ryan Copus is an Associate Professor at the University of Missouri--Kansas City School of Law, 500 E. 52nd Street, Kansas City, MO 64110. Email:~copusr@umkc.edu.}~~~~~~~~~~Ryan Hübert\footnote{Ryan Hübert is an Assistant Professor of Political Science at the University of California, Davis, One Shields Avenue, Davis, CA 95616. Email:~rhubert@ucdavis.edu. \textbf{Corresponding author.}}~~~~~~~~~~Paige Pellaton\footnote{Paige Pellaton is a Ph.D. candidate in Political Science at the University of California, Davis, One Shields Avenue, Davis, CA 95616. Email:~ppellaton@ucdavis.edu.}}

\newcommand{\siauthor}{
\noindent Ryan Copus, UMKC School of Law\\
Ryan Hübert, UC Davis\\
Paige Pellaton, UC Davis
}

\begin{document}

\setcounter{page}{0}

\title{\mytitle\footnote{Authors are listed in alphabetical order. We thank Sarah Anzia, Rachel Bernhard, Chris Cotropia, Scott Daniel, Sean Gailmard, Luzmarina Garcia, Mark Hurwitz, Jonathan King, Gabe Lenz, Adam Pah, David Schwartz, Moses Shayo, Rorie Solberg, Jörg Spenkuch, Laura Stoker, Dvir Yogev, and audience members at UC Berkeley, University of Chicago, LSE, Oxford, MPSA 2022, APSA 2022, and CELS 2022 for their helpful feedback. We are especially grateful to the Systematic Content Analysis of Litigation EventS (SCALES) Open Knowledge Network (\url{https://scales-okn.org/}) for the use of their data. All errors and omissions are our own.}}

\author{\mainauthor}

\date{\mydate}

\maketitle

\vspace{-1em}

\begin{abstract}
    \noindent 
    Are federal lawsuits resolved differently based on the race or gender of the judges assigned to hear them? Recent empirical research posits that women and judges of color decide cases more liberally, at least in some identity-salient areas of law. However, these studies analyze small numbers of cases and judges, and use research designs that limit their causal interpretations. Using an original dataset of all civil rights cases filed in 20 federal district courts over multiple decades and a strong causal identification strategy, we find that assignment of cases to judges of color or women has no statistically significant effect on case outcomes among Democratic appointees. However, it causes more conservative outcomes among Republican appointees. We explain these results with a theory of bargaining over judicial appointments in which Republican presidents take advantage of Democrats' preference for diversity on the bench to appoint more conservative judges.

    \vspace{1em}
    
    \noindent\textbf{Keywords:} judicial politics, trial courts, causal inference, diversity
    
\end{abstract}

\begin{center}

\vspace{1em}

{\small \textbf{Replication files are available in the Harvard Dataverse at:}\\\url{https://doi.org/10.7910/DVN/I8B3VS}.}

\vspace{1em}

10,579 words

\end{center}

\thispagestyle{empty}

\newpage

\part*{}

\begin{quote}
    \small 
    \textit{There will not be an ideological blood test, like there was during the Reagan and Bush years, to see if the candidate is a moderate or liberal. But there will be an insistence upon diversity.}
    
    \hfill -- Senate Judiciary Committee Chair Joe Biden in 1993\footnote{Quoted in \textcite{labaton_1993}.} 
\end{quote}

\doublespace

The beginning of the Biden Administration was marked by a concerted effort to diversify the federal judiciary. In addition to promising to nominate a Black woman to the Supreme Court---and doing so---the newly elected president set to work nominating a historically diverse slate of judges to the lower courts. Two-thirds of the judges appointed during his first year in office were judges of color and 80 percent were women, far surpassing the efforts of his predecessors to increase diversity on the bench in their first years. Biden's insistence on a more inclusive judiciary, following President Trump's lack of interest in the issue, has reignited a debate about the importance of having more women and judges of color on the bench.

Observers, activists, and political actors make many claims about the impacts of diversifying the federal bench. Of particular interest is the extent to which judges from historically underrepresented racial, ethnic, or gender backgrounds approach cases differently than the White men who have traditionally been appointed to the bench. A large body of quantitative research has developed to examine this question. Notwithstanding a handful of exceptions, the recent conventional wisdom that has emerged from this research is that women and judges of color resolve cases in a more liberal manner, at least in some identity-salient areas of law \parencite[see ch. 3 of][]{friedman2020}.

Our analysis of an original dataset of federal civil rights cases makes three important contributions. First, we focus attention on the federal district courts, which have received far less scholarly attention than the federal appeals courts. A large majority of federal judges are seated in district courts, including the vast majority of women and judges of color. The district courts are also the workhorses of the federal judiciary, providing final resolution in more than 90 percent of federal lawsuits. As U.S. District Judge Henry N. Graven once famously said, ``The people of this district either get justice here with me or they don't get it at all'' \parencite[quoted on p.~1 of][]{carp_1996}. 

Second, we take great care to eliminate post-treatment bias. Many prior research findings are vulnerable to post-treatment bias because they are based on datasets that exclude certain cases based on their outcomes \parencite[e.g., excluding settled cases, see][]{hubert_copus_jop}. To be fair, large and comprehensive datasets of federal cases are difficult to create since important data on federal cases are contained in court records that are kept behind an expensive paywall (called ``PACER'', see \url{https://pacer.uscourts.gov}). Our data collection process underscores the seriousness of this barrier: long waits for each court to decide whether to issue a researcher fee exemption; emails from various authorities concerned about our data collection; and a \$30,000 bill for an error in the data collection process.\footnote{Thanks to the graciousness of one chief judge, we ultimately did not have to pay the \$30,000 bill.}

Third, we estimate the effect of assigning cases to judges of different races or genders \textit{without} controlling for other judge characteristics. Almost universally, prior research on diversifying the federal bench tries to ``isolate'' the effect on outcomes of a judge being a member of a minority group or being a woman. These studies do so by controlling for other judicial characteristics, such as Judicial Common Space scores, prior professional experience, and law school attended. But the judicial appointment process is highly political and strategic; nominees' identities play an important role in debates over their nominations, well before they are even seated on the bench. As a result, the women and judges of color that emerge from the appointment process may differ significantly from the White men that emerge from it. This means that many correlations between judges' races and/or genders and their other characteristics could be substantively important \textit{consequences} of diversifying the bench. To draw lessons about whether diversity on the bench affects case outcomes, we do not want to treat those differences as inconveniences that need to be controlled for in a regression.

Most importantly, by not controlling for judicial ideology specifically, our empirical approach can detect effects that may be a consequence of the way that political actors strategically use diversity to achieve ideological goals in the judicial appointment process. It is well understood that ideological alignment is at the center of the appointments process---presidents and senators seek to appoint judges who share their ideological leanings. But racial and gender diversity has also been salient in the appointments process. How might these salient features in the appointment process---ideology, race, and gender---combine to affect who is ultimately appointed to the bench? By not controlling for judges' political ideologies, our analysis allows us to detect the ideological valence of the decisions made by appointees of different races or genders. In this respect, our analysis speaks to a recent literature showing how various types of selection effects explain documented gender and racial differences in both the composition and behavior of public officials \parencite[see, for example,][]{anzia_berry_2011,butler_preece_2016,fields_2016,teele_kalla_rosenbluth_2018,broockman_soltas_2020,bernhard_dbk_2021,folke_rickne_smith_2021}.

By not controlling for other judicial characteristics, we also hope to alleviate some ambiguity that surrounds discussions of diversity in judging. Conditional effects (e.g., women make more liberal decisions in sex discrimination cases, after controlling for ideology and other factors) are often reported as overall effects (e.g., women make more liberal decisions in sex discrimination cases). Our aim is to estimate clearly communicable effects: the effect on case outcomes, within each party, of assigning cases to female judges or judges of color rather than White male judges.\footnote{Our approach to studying diversity on the bench has the additional benefit of avoiding the methodologically fraught exercise of trying to causally estimate the effect of judges' races or genders \parencite[see][]{greiner_rubin_2011,sen_race_2016}.}

Our analysis is one of the largest empirical studies of federal cases to date. We collected and analyzed an original dataset of all civil rights cases filed in 20 federal district courts over multiple decades, totaling around 260,000 cases heard by 545 federal judges. The 20 district courts in our dataset are among the largest and most impactful in the nation. Combined, they have jurisdiction over 40\% of the U.S. population, seat 40\% of the federal district judges, and resolve 40\% of federal civil rights lawsuits. We further supplement our analysis with a second dataset that includes every civil rights case filed in all federal district courts over two years.\footnote{The court records required to construct this dataset were provided by the Systematic Content Analysis of Litigation EventS (SCALES) Open Knowledge Network (see \url{https://scales-okn.org/}).}

We focus on civil rights lawsuits because civil rights is one of the most impactful areas of federal law. Many of the cases in our dataset were brought under some of the nation's landmark civil rights laws, such as the Civil Rights Act of 1964 and the Americans with Disabilities Act of 1990. At a more practical level, civil rights cases have been the focus of many prior studies on judicial diversity, and they also allow for more straightforward interpretations of effects in ideological terms: on average, outcomes favoring plaintiffs are ``liberal'' and outcomes favoring defendants are ``conservative.''\footnote{As we discuss below, a hand-coding of a random sample of complaints shows that only a very small percentage of plaintiff claims could plausibly be characterized as advancing conservative causes (e.g., a White employee suing for race discrimination).}

We statistically test whether the judges of color and women appointed by presidents of each party, whom we collectively term ``nontraditional appointees,'' cause different case outcomes relative to the White men appointed by presidents of the same party, whom we term ``traditional appointees.'' There is substantial variation in the terms that prior scholars use to distinguish White and/or male judges from judges of color and women. We choose to borrow our terminology from \textcite{haire_moyer_book}, one of the most widely cited studies on judicial diversity.

In contrast to the conventional wisdom, we do not find that nontraditional judges generate more liberal outcomes in an identity-salient area of law---civil rights cases. Among Democrats, we find no differences in average outcomes between cases assigned to nontraditional appointees and cases assigned to traditional appointees. Among Republicans, nontraditional appointees cause more conservative outcomes: they cause fewer settlements than traditional appointees, by approximately 2.2 percentage points, and more defendant wins, by approximately 1.2 percentage points. We confirm these findings with a supplemental analysis of a nationally representative dataset. For context, these within-party differences among Republican appointees are in the same direction as, and slightly larger than, the effects we estimate for assigning cases to Republican rather than Democratic appointees. We do not find strong evidence that any of the effects differ among specific subsets of nontraditional appointees, such as Black judges, Latino judges, or White women.

We also test whether particular types of plaintiffs benefit from assignment to a nontraditional appointee. A commonly made claim is that judges of color will generate more favorable outcomes for people of color and female judges will generate more favorable outcomes for women \parencite[see][]{shayo_zussman_2011,harris_bias_2019}. We examine this issue by testing whether judges of color and female judges cause different outcomes in cases filed by plaintiffs of color and female plaintiffs, respectively. We do not find evidence of substantial differences. 

Admittedly, our results do not provide direct causal evidence of the historical effect of diversifying the federal bench. For example, the increasing presence of nontraditional appointees could influence the decision-making of their White male colleagues. Nor do we know the counterfactual decisions of the White men who would have been appointed had presidents decided to appoint fewer judges of color and women. However, the results do suggest that the historical appointment process created a situation in which Republican-appointed women and judges of color resolved cases more conservatively than Republican-appointed White men, whereas Democratic appointees resolved cases the same way regardless of their races or genders. 

We theorize that the difference between the appointees of the two parties is explained by asymmetry in Republican and Democratic preferences for diversity. Since presidents must get their nominees confirmed by the Senate, we argue that the empirical pattern is consistent with a standard theory of bargaining over nominees in which Republican politicians place less importance on diversity on the bench than Democratic politicians do. According to the theory, this kind of preference asymmetry allows Republican presidents to ``trade diversity'' in exchange for nominating women and judges of color who will make more conservative decisions. We outline the logic of this theory below and analyze a formal model in Online Appendix D.

\section*{Racial and Gender Diversity on the Federal Bench}

Historically, the federal judiciary has been composed of mostly White men. In fact, the first woman was not appointed to a federal Article III court until 1934 (Florence E. Allen), and the first Black judge was not named until 1950 (William Henry Hastie). Since the Carter administration, presidents have made concerted efforts to appoint more women and judges of color. We show these trends in Figure \ref{fig:diversification}.

\begin{figure}[ht]
    \centering
    \caption{Each panel plots a bar chart showing the total number of active and senior Article III federal judges serving on January 1 of each year, broken down by the party of the appointing president as well as the race and gender of the judges. Years in which Democratic presidents were appointing judges have a light gray background.}
    \label{fig:diversification}
    \includegraphics{_img/fg1.pdf}
\end{figure}

What role do the racial or gender identities of judges play in how cases get resolved? Examining a wide range of civil cases heard in the federal courts of appeals, studies have coalesced around the notion that---after controlling for some other observable judicial characteristics---a judge's gender and race are associated with different outcomes in cases related to ``racialized'' or ``gendered'' issues, like employment discrimination \parencite{morin_voting_2014, songer_reappraisal_1994, farhang_institutional_2004}, affirmative action \parencite{kastellec_racial_2013}, and sex discrimination \parencite{boyd_untangling_2010}.\footnote{In an interesting twist, \textcite{glynn_identifying_2015} determine that, among appellate judges, having a daughter induces men to decide gender discrimination cases in a more ``feminist'' direction than men without daughters do.} These studies are unified by their common conclusion that women or racial minorities induce more liberal (pro-plaintiff) outcomes, but there are a few exceptions (e.g., \cite{morin_voting_2014} on Latino judges in employment discrimination cases, and \cite{songer_reappraisal_1994} on female judges in obscenity and search and seizure cases).

At the trial court level, an early contribution by \textcite{ashenfelter_politics_1995} did not find evidence that, controlling for a host of other factors, women resolve civil rights cases differently than men. More recently, studies have concluded that cases assigned to women in the district courts are---adjusting for many other factors---more likely to result in pro-plaintiff outcomes in sex/pregnancy discrimination cases \parencite{boyd_representation_2016} and are settled more frequently in civil rights cases \parencite{boyd_shell_2013}. Controlling for other factors, Black judges increase pro-plaintiff outcomes in race discrimination suits \parencite{boyd_representation_2016}, and judges of color are more likely than White judges to produce pro-claimant outcomes in Social Security disability cases \parencite{boyd_judicial_2020}.

Why might the races or genders of judges affect case outcomes?\footnote{Some have also theorized that judges' races and genders should not affect case outcomes since judges are all similarly socialized by the legal profession and should thus approach cases similarly \parencite[see][]{boyd_untangling_2010}.} Scholars have focused on three main theories. The first posits that women and people of color have ``different voices'' they bring to the bench, leading them to resolve cases differently. The second theory holds that women and people of color---due to their different experiences in life---bring different information to judging. Finally, the third theory says that female judges and judges of color will act as substantive representatives of women and people of color, respectively, advocating for those groups' interests in their judicial decision-making. These theories are summarized in several recent papers, so we refer readers to those for more detailed descriptions \parencite[][]{boyd_untangling_2010,boyd_representation_2016,harris_bias_2019}.

This literature has provided a large set of influential empirical findings. But it has some limitations. In addition to a relative lack of focus on the district courts, prior studies use research designs that are vulnerable to post-treatment statistical bias and that may partly obscure differences between judges of different races and genders. They also limit researchers' ability to detect selection effects.

\subsubsection*{Common Research Designs Introduce Post-Treatment Bias}\label{sec:all-else}

Federal court cases are typically assigned to judges randomly. Barring any flukes in random assignment and assuming that researchers account for the structure of the random assignment in the estimation process, this institutional feature provides an opportunity for a ``natural experiment'' that ensures differences in outcomes across judges are due to genuine differences between judges and not simply to the fact that different judges are assigned different kinds of cases. Unfortunately, many prior studies use research designs that do not exploit this random assignment and are vulnerable to statistical bias.

Most notably, many studies present statistical analyses that condition on post-treatment variables \parencite[for discussions of post-treatment bias, see][]{rosenbaum_1985,montgomery_ptbias_2018,knox_admin_data_2020}. The most common way this manifests in courts research is when a researcher performs statistical analyses on subsets of cases that end in certain ways. For example, both \textcite{chew_myth_2009} and \textcite{collins_jr_gender_2010} analyze samples of cases with published opinions. This is problematic because judges choose whether to publish opinions. Different judges may have different proclivities toward publication that could be correlated with other case characteristics. But there are other sources of post-treatment bias, such as controlling for case-level variables that occur after judges are assigned to cases (e.g., whether a case yielded a published opinion).

Many studies of diversity make these research design choices for theoretical or conceptual reasons. For example, some exclude settled cases based on the idea that judges can only ``cause'' outcomes where judges issue judgments or orders that end cases. We think this is an overly narrow view of the ways that judges can cause cases to end differently. Assignment of cases to different judges may cause outcomes that do not involve an ultimate judicial decision. For example, judges can affect case settlements through direct pathways, such as putting pressure on parties to settle. But their impact may also be through more indirect pathways if litigants make strategic decisions based on which judge is assigned to their cases. For example, if a judge who is known (or even simply believed) to be favorable to civil rights plaintiffs is assigned to a civil rights case and the defendant decides to offer a settlement to avoid having their case overseen by a hostile judge, then this is a causal effect of that judge having been assigned to the case. 

The possibility of statistical bias is a serious concern that undermines the interpretation of estimated effects. If, for example, some judges induce more settlements because they are known to be favorable to plaintiffs, then dropping cases that are settled from a dataset will disproportionately drop cases heard by these pro-plaintiff judges. There is mounting empirical evidence that the concern over post-treatment bias is justified. Recent research on the U.S. Courts of Appeals demonstrates that characteristics of published opinions appear to be correlated with the partisan make-up of the panels issuing those opinions \parencite{carlson_etal_2020}. In the context of district courts, \textcite{hubert_copus_jop} show that subsetting to cases that end in certain ways biases the effect of judge partisanship toward zero, potentially causing scholars to mistakenly conclude that political ideology matters less in district courts than at other levels of the federal judiciary.
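
To make the selection problem concrete, consider a stylized sketch in potential-outcomes notation that we introduce here purely for illustration. Let $T_i \in \{0,1\}$ indicate whether case $i$ is randomly assigned to a judge of a given type, let $S_i(t)$ indicate whether the case would settle under assignment $t$, and let $Y_i(t)$ denote the outcome of interest under assignment $t$. Random assignment identifies the average effect of assignment among all filed cases:
\begin{equation*}
    \mathbb{E}[Y_i \mid T_i = 1] - \mathbb{E}[Y_i \mid T_i = 0] = \mathbb{E}[Y_i(1) - Y_i(0)].
\end{equation*}
Dropping settled cases instead yields the comparison
\begin{equation*}
    \mathbb{E}[Y_i(1) \mid S_i(1) = 0] - \mathbb{E}[Y_i(0) \mid S_i(0) = 0],
\end{equation*}
which contrasts the cases that would not settle under one type of judge with the cases that would not settle under the other. If assignment affects settlement, these two sets can contain different kinds of cases, so the difference need not equal a causal effect for any fixed set of cases.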

Even when studies rely on the random assignment of cases (and do not select on outcome variables), they do not always use estimation strategies that account for the randomization process, nor do they present evidence that their estimation strategies successfully exploit random assignment (e.g., balance tests). Random assignment of cases typically occurs within divisions of a district soon after cases are filed, not---as is often assumed---within districts. This may be a serious oversight. For example, it is well known that litigants purposefully choose to file cases in different divisions within a district in order to increase the odds of getting a more favorable judge \parencite{botoman_2018}. 
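
One way to respect this structure, sketched here in hypothetical notation rather than as a description of any particular study's specification, is to compare cases only within the unit in which they were randomized. Let $d(i)$ and $t(i)$ denote the division and filing year of case $i$, and let $N_{j(i)}$ indicate whether the assigned judge $j(i)$ belongs to the group of interest:
\begin{equation*}
    Y_i = \tau N_{j(i)} + \lambda_{d(i),t(i)} + \varepsilon_i,
\end{equation*}
where $\lambda_{d(i),t(i)}$ is a fixed effect for each division-year. If cases are randomly assigned within division-years, $\tau$ is estimated from differences among cases filed in the same division and year, so litigants' choices about where to file do not contaminate the comparison. A balance test in the same spirit would replace $Y_i$ with a case characteristic fixed before assignment and check that the estimated $\tau$ is close to zero.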

\subsubsection*{Trying to ``Isolate'' the Effect of Judges' Identity Features Hides Important Selection Effects}

Most quantitative studies of diversity in judging are framed around theories of judicial behavior that seek to explain \textit{why} men and women, as well as White judges and judges of color, might generate different case outcomes. We briefly described these above. A common goal of these studies is to address potential confounding caused by other judge characteristics (like judges' political ideologies). To do so, they typically include sets of judge-level control variables in their statistical analyses in order to ``isolate'' the specific effect of judges' races or genders on case outcomes. In our review of prior studies, nearly all of them include judge-level control variables.\footnote{Some of these papers report descriptive (or ``naive'') estimates without judge-level controls, but it is unclear whether these estimates are intended to have causal interpretations. For example, reporting the difference in pro-plaintiff rates between men and women does not provide an ``effect'' of gender without some effort to exploit random assignment (e.g., include division-year fixed effects) or otherwise ensure that case characteristics are similar across treatment and control groups (see our discussion above).}

There are several problems with this. First, there is a standard selection-on-observables critique one could make about studies that control for specific judge-level characteristics. It is very difficult, if not impossible, to estimate an unbiased effect of a specific characteristic that is likely to be jointly determined with many other unobserved characteristics. We do not explore this critique in detail here, since there is a robust methodological literature on the topic \parencite[see][]{greiner_rubin_2011,sen_race_2016}.\footnote{There is also a compelling conceptual critique of this style of analysis. It implies that there is something like an essence of race or gender that can be empirically discovered once one controls away various other factors that, in reality, may be part of the complex social construction of a person's racial or gender identity.} Moreover, since different studies control for different judge-level variables---usually some combination of a judge's race, gender, age, ideology, tenure, prosecutorial experience, law school, religion, and prior judicial experience---it is unclear how to compare effects across studies.

There is another problem: reporting conditional effects can conceal insights about the overall impact of diversity on the bench. Consider a hypothetical dataset that generates the summary statistics presented in Table \ref{tab:control_example}. A researcher is interested in seeing whether men or women are more favorable to plaintiffs, but she also has an intuition (based on theory) that ideologically liberal judges tend to rule for plaintiffs more often than ideologically conservative judges. She accordingly decides any regression looking at the effect of judge gender on case outcomes should include a control variable for judges' ideologies.

\begin{table}[ht]
    \centering
    \caption{Each cell contains the proportion of pro-plaintiff decisions for each combination of the two judge-level characteristics in a hypothetical dataset.}
    \label{tab:control_example}
    \begin{tabular}{r|c|c|}
    \multicolumn{1}{c}{} & \multicolumn{1}{c}{~~~Men~~~} & \multicolumn{1}{c}{Women}\\\cline{2-3}
    \multicolumn{1}{r|}{Ideologically Liberal} & 60\% & 65\% \\\cline{2-3}
    \multicolumn{1}{r|}{Ideologically Conservative} & 25\% & 30\% \\\cline{2-3}
    \end{tabular}
\end{table}

Such an analysis would find that women are 5 percentage points more likely to rule for plaintiffs than men, holding constant judges' ideologies. With this finding in hand, suppose that a hypothetical paper's abstract reports: ``Assignment of cases to women causes those cases to end in a pro-plaintiff way more often than assignment to men.'' However, this statement may be misleading. Suppose now that the women in this dataset are more likely to be ideologically conservative than men in this dataset. Depending on how different the pools of men and women are, it is possible that the assignment of cases to women actually decreases the rate of pro-plaintiff decisions!\footnote{To be more concrete, suppose that 20 percent of the women in the sample are ideologically liberal whereas 40 percent of the men are. Then, the pro-plaintiff rate for women is $0.65 \times 0.20 + 0.30 \times 0.80 = 37\%$ and for men it is $0.60 \times 0.40 + 0.25 \times 0.60 = 39\%$. Clearly, assignment to a woman would \textit{not} increase the likelihood of a pro-plaintiff outcome in that situation.}
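
The reversal follows from the law of total probability, which the following illustrative calculation spells out. Write $P$ for a pro-plaintiff outcome, $W$ and $M$ for assignment to a woman or a man, and $L$ and $C$ for ideologically liberal and conservative judges. The conditional rates come from Table \ref{tab:control_example}; the shares of liberal judges (20 percent of women, 40 percent of men) are hypothetical values chosen only to illustrate the point:
\begin{align*}
    \Pr(P \mid W) &= \Pr(P \mid W, L)\,\Pr(L \mid W) + \Pr(P \mid W, C)\,\Pr(C \mid W) \\
                  &= (0.65)(0.20) + (0.30)(0.80) = 0.37, \\
    \Pr(P \mid M) &= \Pr(P \mid M, L)\,\Pr(L \mid M) + \Pr(P \mid M, C)\,\Pr(C \mid M) \\
                  &= (0.60)(0.40) + (0.25)(0.60) = 0.39.
\end{align*}
The overall difference is $0.37 - 0.39 = -0.02$, or 2 percentage points in the conservative direction, even though women's pro-plaintiff rate is 5 percentage points higher than men's within each ideological group.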

This is an example of Simpson's Paradox, and it highlights the pitfalls of making claims about overall effects based on conditional estimates. This is substantively meaningful in our context: to know whether diversity on the federal bench affects case outcomes one would, at a minimum, want to know the overall effect. Yet the overall effect may differ in magnitude (or even direction) from a conditional effect. This is because differences in other judge characteristics might be an important \textit{consequence} of diversifying the bench. Indeed, our own analysis demonstrates that nontraditional Republican appointees generate more conservative case outcomes than traditional Republican appointees, teaching us something important about the kinds of judges Republican presidents have put on the bench. Had we controlled for judges' ideologies, we might have inadvertently hidden this finding.

An important clarification is necessary: controlling for judge characteristics is not the same thing as looking at treatment effects among subsets of judges.\footnote{At a more technical level, the former involves adding judge characteristic variables to a regression or matching algorithm, and the latter involves \textit{interactions} between the main treatment variable and these additional judge characteristics or performing separate analyses on subsets of a dataset.} For example, since politicians from the two parties seem to approach the issue of diversity on the bench differently, there is a clear conceptual rationale for performing within-party analyses by comparing nontraditional appointees with the White men appointed by presidents of the same party. This is what we do in our analysis. We are examining conceptually grounded heterogeneous treatment effects, not trying to ``control for'' a judge's political ideology, as is the standard rationale for including this variable in other judicial politics research.
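
The distinction can be written out in a stylized form. The notation below is introduced only for illustration; it is not the specification we estimate, and it omits the fixed effects needed to account for random assignment. Let $N_{j(i)}$ indicate that case $i$ is assigned to a nontraditional appointee, let $X_{j(i)}$ collect other judge characteristics, and let $R_{j(i)}$ indicate a Republican appointee. Controlling for judge characteristics corresponds to
\begin{equation*}
    Y_i = \alpha + \beta N_{j(i)} + \gamma^{\top} X_{j(i)} + \varepsilon_i,
\end{equation*}
in which $\beta$ is a single effect conditional on $X_{j(i)}$. A within-party analysis instead corresponds to
\begin{equation*}
    Y_i = \alpha + \beta_D N_{j(i)} + \theta R_{j(i)} + \delta \left( N_{j(i)} \times R_{j(i)} \right) + \varepsilon_i,
\end{equation*}
in which $\beta_D$ is the difference between nontraditional and traditional appointees among Democratic appointees and $\beta_D + \delta$ is the corresponding difference among Republican appointees. Estimating this second equation is equivalent to estimating a separate regression of $Y_i$ on $N_{j(i)}$ within each party.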

Prior research studies often include judge-level control variables in order to test various hypotheses derived from theories of judicial behavior, and especially theories of race and gender in judging. Our focus is different since we seek to better understand whether the creation of a more diverse judiciary has affected case outcomes. As a result, we have no clear reason to include judge-level controls in our statistical models. But even if we were examining such theories of judicial behavior in this article, our core point still applies: including judge-level controls would weaken our ability to make accurate causal claims about the impact of diversity on the bench. When one includes judge-level controls in their main analyses, this is akin to skipping past the question ``Do these judges have an effect?'' and jumping straight to the question ``\textit{Why} do these judges have an effect?''\footnote{We think this is, at least partially, due to an approach toward research that emphasizes deriving and articulating hypotheses from theory which can be ``tested'' with data. Our approach is \textit{design based} \parencite[in the causal inference sense, see][]{credibility2010}; we focus on estimating unbiased causal effects and then follow up with a novel theoretical framework that helps explain how those effects could have arisen (see below).} We think it is important to convincingly estimate unbiased effects before exploring the mechanisms driving those effects \parencite[see also p.~173 of][]{friedman2020}.

\section*{Data and Research Design}\label{sec:data-research}

We constructed an original dataset of civil rights cases filed in 20 federal district courts. For most of these courts, our dataset includes all civil rights cases either filed between 1995 and 2016, or filed between 1995 and 2020. However, for three of the smaller courts, our dataset spans fewer years. Because our identification strategy requires us to estimate effects within district and within year, the slight differences in year coverage across the courts in our dataset do not create methodological problems. In Figure \ref{fig:our_courts_text}, we show the composition of our dataset.

\begin{figure}[H]
    \centering
    \caption{We show the number and percentage of cases in our dataset drawn from each of the 20 courts included in our analysis. For each court, we also use color shading to indicate the year range for which we have data from that court.} \label{fig:our_courts_text}
    \includegraphics{_img/fg2.pdf}
\end{figure}

Our dataset contains information about each case's characteristics as well as the presiding judge. Most of our case-level variables are drawn from the publicly available version of the Federal Judicial Center's Integrated Database (known as the ``IDB'' and available at \url{https://www.fjc.gov/research/idb}). We identified civil rights cases using the nature of suit (NOS) code variable in this dataset (i.e., those with NOS codes beginning with 44). In Online Appendix A, we provide additional information about these cases, both from the IDB and from a hand-coded random sample of cases in our dataset.

The publicly available version of the IDB redacts the judge name from each case and only contains rudimentary information about each case's litigants. We add information about each case's litigants and presiding judge from an original database of docket sheets that we collected from the federal courts' fee-based online records system (called PACER) in connection with our ongoing research on judicial decision making. Twenty district courts from six different circuits and in four different regions of the U.S. issued us fee waivers that enabled us to access PACER for free. After more than a year of data collection and data cleaning, we were able to link judge-identifying information from these court records to the IDB.

Biographical data for each judge appearing in our dataset---and most importantly, each judge's race and gender---are drawn from the FJC's Biographical Directory of Article III Federal Judges (\url{https://www.fjc.gov/history/judges}). We follow prior research on federal judges and take the FJC's Biographical Directory as an authoritative source of information about each judge's racial and gender identity. We merge our three data sources together using both case numbers and judge names.

We supplement our main analysis with an analysis of a second dataset that contains all civil rights cases filed in every federal district court in 2016 and 2017. The docket sheets for this dataset were graciously provided by the Systematic Content Analysis of Litigation EventS (SCALES) Open Knowledge Network, which is working to build a platform that will provide open access to federal court data. We constructed this second dataset, which we call the SCALES dataset, using the same steps as we took for our main dataset.

Our statistical analyses focus on estimating the effect that assigning cases to nontraditional appointees instead of traditional appointees has on the outcomes of civil rights cases. Because our research design allows us to approximate a randomized experiment, we often describe the assignment of cases to nontraditional appointees as the ``treatment'' and the assignment of cases to traditional appointees as the ``control.'' We also decompose this treatment variable and show separate estimates for several subgroups of nontraditional appointees. Our choice to define ``treatment'' and ``control'' in this way is simply a matter of labeling; our results would be identical (but with the opposite sign) if we reversed this. In Figure \ref{fig:our-judges}, we provide more detail about the racial and gender breakdown of the judges in our main dataset. In Online Appendix A, we provide this same information for the SCALES dataset.

\begin{figure}[H]
    \centering
    \caption{We plot the number of judges in our main dataset, broken down by judges' races, genders, and partisanship.} 
    \label{fig:our-judges}
    \includegraphics{_img/fg3.pdf}
\end{figure}

District court cases can end in many different ways. In our analysis, we focus on the two most prevalent case outcomes in our dataset: settlements (45\% of cases) and defendant wins (i.e., both involuntary dismissals and judgments favoring the defendant, together comprising 33\% of cases). We code these case outcomes using both the IDB and our database of docket sheets. Online Appendix A provides additional descriptions of these outcomes and additional details about our coding process.\footnote{To briefly summarize, we start with the IDB's ``DISP'' and ``JUDGMENT'' variables, and then use our docket sheet database to identify and recode case outcomes that are known to be systematically miscoded in the IDB \parencite[see][]{hadfield_2004}.}

Civil rights cases have a relatively clear political directionality to them (at least on average) since they almost always involve a plaintiff alleging a civil rights violation by a defendant.\footnote{For example, using the logic of the case space \parencite[see][]{lax_2011}, a judge with a more liberal interpretation of discrimination law may be inclined to rule in favor of plaintiffs who present evidence that would be insufficient to convince a judge with a more conservative interpretation of discrimination law.} We manually reviewed a random sample of cases and found that in only 3\% of them could the plaintiff's legal claims plausibly be classified as ideologically conservative (e.g., a claim that an employee was discriminated against on the basis of being White). As a result, we interpret defendant wins as more ``conservative.'' We interpret settlements as more ``liberal'' relative to defendant wins.\footnote{The latter interpretation is aided by the fact that we observe settlements and defendant wins trading off for one another in our main analysis.}

In order to provide unbiased estimates of the effect of assigning nontraditional appointees to cases, we rely on the assumption that judges are randomly assigned to cases. To make this assumption reasonable, we take several steps that we outline in detail in Online Appendix B. In brief, we first drop subsets of cases (based on pre-treatment characteristics) that we suspect are not randomly assigned, and then we perform an aggressive statistical test of case randomization.

After this data cleaning, our randomization test provides convincing statistical evidence that the remaining cases in our dataset were randomly assigned to judges within each district's division and year. We can thus conceptualize our research design as a blocked (natural) experiment with as-if random treatment assignment within district-division-year randomization blocks. We use regression adjustment to account for the district-division-year randomization blocks, using the strategy described in \textcite{lin_2013} and research design recommendations provided in \textcite{green_lab_2016}.

Because our effects are only causally identified within each randomization block, we can only include randomization blocks that have sufficient variation in treatment assignment. This means our sample size varies across our analyses. We conduct our main analysis on 264,889 civil rights cases heard by 250 nontraditional appointees (164 appointed by Democratic presidents and 86 appointed by Republican presidents) and 295 traditional appointees (131 appointed by Democratic presidents and 164 appointed by Republican presidents). In all our figures, we include the sample size and number of judges for each analysis. We limit our analysis to cases assigned to judges appointed by Presidents Carter through Obama.

This estimation strategy allows us to recover credible causal estimates of the effect of assigning cases to nontraditional appointees, relative to assigning cases to White men. As we note above, we estimate effects separately for Democratic and Republican appointees to allow for the possibility that the effect of the assigned judge's race or gender on case outcomes differs depending on whether he or she is a Democratic or Republican appointee.

The regression model we use for each of our analyses is:
\begin{align*}
    Y_{i}^p = \alpha^p
    + \beta^p \cdot N_{i}^p 
    + \sum_{dy} \left\{\phi_{dy}^p \cdot X_{idy}^p 
    + \gamma_{dy}^p \cdot N_{i}^p \cdot (X_{idy}^p - \overline{X}_{dy}^p)\right\} 
    + \varepsilon_{i}^p
\end{align*}
where $i$ indexes cases, $dy$ indexes a court division and case filing year (i.e., a randomization block), and $p$ indexes the party of the appointing president. The variable $N_{i}^p$ takes a value of 1 if case $i$ is assigned to a nontraditional appointee and 0 if it is assigned to a traditional appointee. $X_{idy}^p$ is a dummy variable indicating whether case $i$ is in randomization block $dy$, and $\overline{X}_{dy}^p$ is the proportion of cases in our sample heard within randomization block $dy$.
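As a brief sketch of why the coefficient on $N_{i}^p$ can be read as an average effect under this specification, consider the treatment-control difference in expected outcomes that the model implies within a single block. The shorthand $\tau_{dy}$ is introduced here only for illustration, the party superscript $p$ is suppressed, and the sums run over the blocks that enter the interaction terms.
\begin{align*}
    \tau_{dy} &\equiv \mathbb{E}[Y_i \mid N_i = 1, \text{block } dy] - \mathbb{E}[Y_i \mid N_i = 0, \text{block } dy]
    = \beta + \gamma_{dy} - \sum_{d'y'} \gamma_{d'y'} \cdot \overline{X}_{d'y'}, \\
    \sum_{dy} \overline{X}_{dy} \cdot \tau_{dy} &= \beta + \sum_{dy} \overline{X}_{dy} \cdot \gamma_{dy} - \Big(\sum_{dy} \overline{X}_{dy}\Big) \sum_{d'y'} \gamma_{d'y'} \cdot \overline{X}_{d'y'} = \beta.
\end{align*}
The last equality uses the fact that the block shares $\overline{X}_{dy}$ sum to one across all blocks. Under this reading, $\beta$ is the average of the within-block differences, weighted by each block's share of cases, which is the rationale for centering the interaction terms as in \textcite{lin_2013}.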

Our main estimate of interest for each analysis is the estimate for $\beta^p$, which gives the average treatment effect of a case being assigned to a nontraditional appointee of party $p$. We cluster standard errors at the judge level. We use the \texttt{estimatr} package for the R statistical programming language to estimate effects and standard errors \parencite{estimatr}.

\section*{Does Diversity on the Bench Benefit Civil Rights Plaintiffs?}

We begin our analysis by estimating whether nontraditional appointees cause different case outcomes than traditional appointees. We plot these effects in the two left panels in Figure \ref{fig:fg4}. Among Democratic appointees, there is no statistically significant difference in the average outcomes of the civil rights cases assigned to nontraditional appointees versus those assigned to traditional appointees. However, among Republican appointees, civil rights cases are less likely to end in settlements and more likely to end in wins for the defendant when assigned to a nontraditional appointee instead of a traditional appointee. Specifically, when a case is assigned to a nontraditional Republican appointee instead of a traditional Republican appointee, the case is 2.2 percentage points less likely to settle ($p$-value < 0.01) and 1.2 percentage points more likely to end in a judgment for the defendant ($p$-value < 0.05).

\begin{figure}[ht]
    \centering
    \caption{   
    Each point plots an average treatment effect on a specific case outcome (depicted on the $x$-axis), along with a 95\% confidence interval using judge-clustered standard errors. In the left two panels, we plot our main effects. In the right panel, we plot the average treatment effect of assigning Republican appointees to cases (instead of Democratic appointees), which we provide for comparison. For each estimate, we present the number of cases in the analysis (top number) and the number of treatment/control judges (bottom number). Full results for this plot are available in Table E.1 in Online Appendix E.}\label{fig:fg4}
    \includegraphics{_img/fg4.pdf}
\end{figure}

To place these effect sizes in substantive context, we also estimate partisan differences in case outcomes. Specifically, for each of the two outcomes, we estimate the average treatment effect of assigning cases to Republican appointees instead of Democratic appointees using the same estimation strategy as described above. In the right panel of Figure \ref{fig:fg4}, we plot the estimates for these partisan effects. Assignment to a Republican appointee instead of a Democratic appointee decreases the probability of settlement by 1.6 percentage points and increases the probability of a defendant win by 0.9 percentage points (although the latter effect is not statistically significant at the 0.05 level). Our effects for nontraditional Republican appointees are both in the same direction and similar in magnitude to the effects for assignment to Republican appointees rather than Democratic appointees. This gives us further confidence that settlements can roughly be interpreted as more liberal outcomes in civil rights cases and defendant wins can roughly be interpreted as more conservative outcomes in civil rights cases.

While these effect sizes may seem small in magnitude, it is well known that many cases filed in federal district courts are frivolous. Effects are likely to be concentrated in the cases that are not frivolous, but identifying those cases with existing datasets and without introducing post-treatment bias is a challenge.\footnote{Many prior research studies attempt to focus on more ``important'' cases by, for example, analyzing only published opinions or cases that ended in a formal judgment. Unfortunately, this approach introduces post-treatment bias.}

Though our main dataset is expansive (covering 40\% of the U.S. population), it is not a national sample. In order to mitigate concerns about generalizability, we supplement our main analysis with an analysis of the SCALES dataset, a smaller dataset but one that covers the population of civil rights cases filed in 2016 and 2017. Figure \ref{fig:fg5} displays the results, which support our previous findings: nontraditional Republican appointees cause fewer settlements and issue more decisions favoring defendants than traditional Republican appointees, while there is no difference among Democratic appointees. 

\begin{figure}[ht]
    \centering
    \caption{Each point plots an average treatment effect on a specific case outcome (depicted on the $x$-axis), along with a 95\% confidence interval using judge-clustered standard errors. These analyses use the SCALES dataset. Full results for this plot are available in Table E.2 in Online Appendix E.}\label{fig:fg5}
    \includegraphics{_img/fg5.pdf}
\end{figure}

So far, we have only discussed whether cases end differently depending on whether they are assigned to nontraditional appointees or traditional appointees. However, we can also examine how the effects depend on which kind of nontraditional appointees are assigned to cases. In Figure \ref{fig:fg6}, we show the effects on case outcomes for various subsets of nontraditional appointees. Each estimate is depicted with a circle and continues to use traditional appointees (i.e., White men) as the reference category. We restrict our analysis to only one outcome variable---settlements---but present analysis for defendant wins in Figure C.1 in Online Appendix C. Note that we only present estimates when we have at least 20 judges in the treatment group.

There is little evidence of effect heterogeneity by subgroup. Because of the large number of statistical tests we are conducting, we focus on the thinner, longer confidence intervals reported in the figure, which have been adjusted for multiple hypothesis testing.\footnote{We adjust the confidence intervals using the Bonferroni method. Because the method will tend to overcorrect when applied to dependent hypotheses, we estimate the number of independent tests using the procedure described in \textcite{derringer2018}.} We find a statistically significant effect only for Republican White women and men of color. Despite those significant effects, it is important to note that there are generally no substantial differences between subgroups of nontraditional appointees. As is apparent from the figure, the confidence intervals across the subgroups are largely overlapping, with most point estimates being covered. Moreover, if one wanted to explicitly compare these effects to one another (e.g., comparing the effect for women of color to the effect for White women), even the adjustments we make to the confidence intervals are not aggressive enough since they only account for each test independently. We would need to adjust for the many additional hypothesis tests that are implied by comparing the estimates with one another. We thus cannot conclude that the overall effects of nontraditional appointees are driven by a particular subset of those judges.
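To illustrate how a Bonferroni adjustment of this kind widens an interval, consider a minimal sketch under a normal approximation. Here $\hat{\theta}$ and $\widehat{\mathrm{SE}}$ denote a generic point estimate and its standard error, and $m$ is a hypothetical number of effective independent tests chosen purely for illustration.
\begin{align*}
    \text{Unadjusted 95\% interval:} \quad & \hat{\theta} \pm z_{0.975} \cdot \widehat{\mathrm{SE}}, \qquad z_{0.975} \approx 1.96, \\
    \text{Bonferroni-adjusted interval:} \quad & \hat{\theta} \pm z_{1 - 0.05/(2m)} \cdot \widehat{\mathrm{SE}}.
\end{align*}
With $m = 10$, for example, the critical value becomes $z_{0.9975} \approx 2.81$, so the adjusted interval is roughly $2.81 / 1.96 \approx 1.43$ times as wide as the unadjusted one. Estimating $m$ rather than using the raw number of tests yields a smaller $m$ when the tests are dependent, and thus a less conservative interval.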

\begin{figure}[ht]
    \centering
    \caption{ 
    Each point plots an average treatment effect on settlement, along with a 95\% confidence interval using judge-clustered standard errors (the thinner, longer bars present adjustments for multiple hypothesis testing using the Bonferroni method, with the number of independent tests estimated). Each estimate shows the estimated effect of assigning cases to judges with specific racial and/or gender characteristics, relative to traditional appointees. Full results for this plot are available in Table E.3 in Online Appendix E.
    }\label{fig:fg6}
    \includegraphics{_img/fg6.pdf}
\end{figure}

\section*{Does Diversity on the Bench Benefit Women or People of Color?}

The significance of diversity on the bench may extend beyond its general impact on case outcomes. In particular, nontraditional appointees might improve outcomes specifically for women and people of color. As \textcite{harris_bias_2019} points out, ``[r]esearch suggests that more women on the courts would lead to more decisions favorable to women [and] more people of color on the courts would lead to more decisions favorable to people of color...'' (p. 243).

We use our dataset to explore this issue. First, we use standard automated methods to (1) identify which plaintiffs in our dataset are human individuals, and (2) predict the race/ethnicity and gender of those individuals. We use the \texttt{gender} package in R to predict each plaintiff's gender and the \texttt{wru} package in R to predict each plaintiff's race or ethnicity.\footnote{Though it is standard in the literature, the computational approach we use to predict the plaintiffs' gender and race/ethnicity is imperfect. To the extent that our automated process misclassifies some of the plaintiffs, this will introduce measurement error into our estimates. Although that error is not correlated with the assigned judge, these algorithms are known to have particular difficulty distinguishing White and Black names. The lack of significant effects on groups that include Black plaintiffs could thus be due to measurement error. As a robustness check, we also present results using a different package for classifying plaintiffs' races, \texttt{predictrace}. See Figure C.3 in Online Appendix C.} Given that data on the racial and gender identities of civil plaintiffs are not typically collected by federal courts, researchers must rely on these cutting-edge tools to predict these characteristics based on available data. These tools are now commonly used in political science \parencite[e.g.,][]{grumbach_sahn_2020,grumbach_sahn_staszak_2022}. We discuss our data coding process in more detail in Online Appendix A. Using our predictions, we identify four subsets of cases: those in which all plaintiffs are predicted to be people of color, those in which all plaintiffs are predicted to be White, those in which all plaintiffs are predicted to be women, and those in which all plaintiffs are predicted to be men. In the subsets of cases filed by plaintiffs of color and White plaintiffs (of any gender), we test whether judges of color cause different outcomes than traditional appointees. In the subsets of cases filed by women and men (of any race), we test whether female judges cause different outcomes than traditional appointees.

We only examine whether our effects vary by the identity of plaintiffs. In the context of Israeli small claims courts, \textcite{shayo_zussman_2011} show that \textit{defendant} identity also affects case outcomes. This is a less salient issue in our setting since the majority of the defendants in our dataset are organizations and governments. Indeed, only a small subset of the cases in our dataset (around 7\%) feature individual (human) plaintiff(s) suing individual (human) defendant(s).

In this analysis, of the judges included in the ``judges of color'' category for Democratic appointees, 56\% are Black, 30\% are Latino, and 16\% are Asian American.\footnote{This adds up to greater than 100\% because some of the judges are mixed race individuals and counted in multiple categories.} Of the judges included in the ``judges of color'' category for Republican appointees, 43\% are Black, 49\% are Latino, and 8\% are Asian American. Among the cases filed by plaintiffs of color that were heard by judges of color, 46\% were filed by plaintiffs of the same race as the judge. Among the cases filed by women that were heard by female judges, 44\% were filed by plaintiffs of the same race as the judge.

\begin{figure}[ht]
    \centering
    \caption{Each point plots an average treatment effect on settlement, along with a 95\% confidence interval using judge-clustered standard errors. The squares show the effect of assigning cases to judges with specific racial and/or gender characteristics in cases where the plaintiffs share the identity of the treatment group appointees. The diamonds show the effect of assigning cases to judges with specific racial and/or gender characteristics in cases where the plaintiffs do not share the identity of the treatment group appointees. Full results for this plot are available in Table E.4 in Online Appendix E.}\label{fig:fg7}
    \includegraphics{_img/fg7.pdf}
\end{figure}

We plot the results of these analyses in Figure \ref{fig:fg7}. First, we do not find evidence of any benefit to plaintiffs of color for having their cases assigned to a judge of color instead of a traditional appointee, nor do we find any benefit to female plaintiffs for having their cases assigned to a woman instead of a traditional appointee (the square-shaped point estimates in the figure). Second, we do not find any evidence that the treatment effect for judges of color varies by whether the plaintiffs are people of color or White; nor do we find any evidence that the treatment effect for women appointees varies by whether the plaintiffs are women or men (comparing the square-shaped point estimates with the diamond-shaped point estimates).

Because we are subsetting to a smaller number of cases filed by specific kinds of plaintiffs, testing for more nuanced effects than what we present in Figure \ref{fig:fg7} would slice our data very thinly. (For example, there were only around 150 cases filed by Asian plaintiffs that could have been assigned to an Asian American appointee.) Our focus is on testing the more statistically tractable claim summarized in \textcite{harris_bias_2019} that judges of color will produce better outcomes for plaintiffs of color and female judges will produce better outcomes for female plaintiffs. Nonetheless, for curious readers, we present additional results in Figure C.2 in Online Appendix C. We do not find evidence that nontraditional appointees provide more favorable outcomes for plaintiffs who share their identities. 

\section*{Are Republicans Trading Diversity for Ideology?}

What could explain why nontraditional Republican appointees resolve cases more conservatively, while there are apparently no differences among Democratic appointees? We argue that the asymmetry between the parties in our results, as well as the direction of those results, is broadly consistent with a strategic logic of partisan bargaining over judicial nominations in which diversity plays a role in presidents' strategic calculations \parencite[see also][]{asmussen_2011}. This strategic logic starts with the premise that a president requires some buy-in from politicians of the opposing political party in order to successfully appoint his or her judges. This premise is reasonable in our context since a president's judicial nominees must be confirmed by the Senate, and, for the judges appointed in our dataset, both Senate rules and political norms made confirmation difficult without some support from members of both parties.\footnote{Only 39 of the 545 judges in our sample received their commission after Senate Democrats changed Senate rules so that confirmation of district judges could not be subject to filibuster. Moreover, of the judges in our sample appointed before this rule change, only 32 received their commission during periods of unified government and a filibuster-proof majority in the Senate.} 

In Online Appendix D, we analyze a simple formal model of political bargaining over judicial nominations in which two political parties have preferences over both the ideology and demographic characteristics of a judge who is nominated to fill a judicial vacancy. A key parameter in the model is $b_i \in \mathbb{R}$, which is player $i$'s payoff from the president appointing a nontraditional appointee instead of a traditional appointee. So, holding all else equal: $b_i > 0$ implies player $i$ prefers a nontraditional nominee, $b_i < 0$ implies player $i$ prefers a traditional nominee, and $b_i = 0$ implies player $i$ is indifferent about whether the nominee is nontraditional or not. 

Importantly, the player who makes a nomination in the model (i.e., the party that holds the presidency) pays a political cost if it does not win approval from the other party.\footnote{These costs vary. For example, the opposing party could filibuster a nominee (a relatively higher cost) or release embarrassing information about a nominee (a relatively lower cost).} If a nominating president wishes to avoid paying this political cost, then she must provide some concession to the opposing party. This yields the following main result. (See Online Appendix D for the formal analysis and proofs.) 

\newtheorem{proposition}{Proposition}

\begin{proposition}[Trading Diversity]\label{prop:trading-diversity} Let $b_O \in \mathbb{R}$ be the opposition party's payoff from the appointment of a nontraditional appointee. In the unique equilibrium of the model of judicial nominations characterized in Online Appendix D, relative to a president's traditional nominees, her nontraditional nominees will be
\begin{itemize}
    \item more ideologically congruent with her if $b_O > 0$.
    \item no more or less ideologically congruent with her if $b_O = 0$.
    \item less ideologically congruent with her if $b_O < 0$.
\end{itemize}
\end{proposition}

Assuming that our model provides a reasonable approximation of the parties' incentives when bargaining over judicial nominees, an empirical implication of Proposition \ref{prop:trading-diversity} is that observed effects among one party's appointees reveal information about the other party's preference for (or against) the appointment of nontraditional nominees.

In light of this strategic logic, our main empirical findings provide support for the notion that Democrats place substantial weight on appointing a more inclusive federal bench, while Republicans do not. As a result, Democratic presidents cannot gain any ideological advantage by strategically choosing the identity of their nominees since Republican politicians are fairly indifferent about the identities of judicial nominees. On the other hand, Republican presidents can use Democrats' preference for diversity to extract ideological concessions, using diversity in appointments as a tool to appoint more conservative judges who would potentially engender more opposition from Democrats were they White men. We term this phenomenon ``trading diversity'' since Republican presidents can trade diversity for ideology, nominating nontraditional appointees who will act more conservatively on the bench.

Is there corroborating evidence to support the idea that Democrats value diversity while Republicans do not? Recall Figure \ref{fig:diversification}, which demonstrates that Democratic presidents have appointed a substantially larger percentage of nontraditional appointees than Republicans. Between 1977 and 2020, 48\% of Democratic presidents' appointees were women or judges of color, whereas only 26\% of Republican appointees were women or judges of color.

On its face, this descriptive statistic lends support for the key implication of our trading diversity argument---that Democrats value diversity more than Republicans. However, what if these patterns in appointments are simply an artifact of differences in the pools of potential appointees available to Democratic presidents and Republican presidents? Perhaps Republican presidents value diversity just as much as Democratic presidents, but they face much higher search costs when attempting to recruit appointees from underrepresented groups.\footnote{See Proposition D.2 in Online Appendix D.} However, using the logic of our theoretical model, this is unlikely. If Republican politicians placed a premium on diversity, then by Proposition \ref{prop:trading-diversity}, we would expect to see nontraditional Democratic appointees making more liberal decisions than the Democratic-appointed White men. We do not see this pattern in our data.

A similar argument allows us to rule out an obvious alternative explanation for our finding that nontraditional Republican appointees make more conservative decisions: namely, that Republican presidents engage in taste-based discrimination against nontraditional appointees. Our trading diversity argument relies on Republicans being indifferent about diversity. If instead they were \textit{hostile} to diversity, then this would (also) lead them to appoint especially conservative nontraditional appointees.\footnote{This uses the well-known logic of taste-based discrimination first articulated by \textcite{becker_1957}, which in this context predicts that Republican presidents would need to get some extra benefit (by way of more conservative nontraditional appointees) in order to be willing to overcome their racial or gender bias and appoint nontraditional appointees.} However, this is not consistent with our other findings. If Republicans were biased against nontraditional appointees, then from Proposition \ref{prop:trading-diversity} we would also expect that bias to result in more conservative nontraditional Democratic appointees since Democratic presidents would need to nominate more conservative nontraditional appointees in order to overcome Republicans' racial or gender bias.

Our trading diversity argument is static and implies that appointing presidents make nominations that reflect the strategic environment at the point in time when they select judges for the bench. That said, our empirical results pool judges across time, comparing cases heard by all judges of color and women with cases heard by all White men. As is apparent from Figure \ref{fig:diversification}, the bench has become more diverse as time passes. Our own dataset, for example, starts in 1995 with 36\% of judges being nontraditional appointees and ends in 2020 with 55\% being nontraditional appointees. There is a risk that the differences (or lack of differences) we see between traditional and nontraditional appointees in our analysis are driven simply by the fact that we are comparing nontraditional appointees disproportionately appointed more recently to traditional appointees appointed longer ago.\footnote{In our dataset, the median appointment year of the traditional appointees is 1993, whereas the median appointment year of the nontraditional appointees is 1999.} If this were true, it would undermine our theoretical argument. To guard against this, we re-estimate our effects from Figure \ref{fig:fg4} controlling for each presiding judge's appointing president. Our effects are nearly identical, suggesting that our main effects are not driven by a time trend (see Figure C.3 in Online Appendix C).

One final alternative explanation is that nontraditional Republican appointees may have been disproportionately appointed in times when there were fewer political constraints facing Republican presidents. If this had been the case, then Republican presidents would have been able to appoint more conservative judges (who just happened to be judges of color or women) because they faced less opposition from Democrats in the Senate. However, historical data on the partisan make-up of the Senate suggests that Republican presidents appoint nontraditional appointees when they face \textit{more} of a political constraint. In our sample, nearly 51 percent of the White men appointed by Republican presidents were confirmed by a Republican majority in the Senate. In contrast, of the judges of color and women appointed by Republican presidents, 62 percent were confirmed when Republicans were in the \textit{minority} in the Senate, consistent with the idea that political vulnerability encourages Republican presidents to trade diversity in order to see their judges confirmed.\footnote{For all district judges appointed by Republican presidents from 1977 to 2016, around 51 percent of the White men were appointed when Republicans held control of the Senate, versus 42 percent of the judges of color and women.}

One might also take issue with the notion that presidents and senators are able to clearly discern ideological differences among those in their pool of potential nominees \parencite[see, for example,][]{hofer_achury_2021}. For the logic of the model to work, presidents and senators must be able to predict which judges will make more or less ideological decisions. However, the logic of the model does not require that they perfectly predict this, only that these predictions are accurate on average.\footnote{A simple extension to the model could incorporate uncertainty over the nominee's ideology, but this would make the model more complicated without altering the core finding that, in expectation, a president can ``trade diversity'' for ideology when members of the opposing party care about diversity on the bench.} Moreover, we think our empirical analysis provides evidence that they do. Among Republican appointees, the differences in outcomes across nontraditional and traditional appointees are similar to the difference between Republican and Democratic appointees. If, as there is little doubt, presidents and senators can discern the across-party differences that affect case outcomes, then it is reasonable that they could also discern the within-party differences that yield effectively equivalent effects.

We argue that the logic of trading diversity provides a compelling explanation for our empirical results. However, our model black-boxes some potential mechanisms that are still consistent with our overall argument. For example, we do not know whether Republicans purposefully choose especially conservative women and people of color from a broad range of potential Republican appointees, or instead whether the pool of potential judges of color and women available to Republican presidents is especially conservative to begin with. Even if the pool of potential nontraditional Republican appointees is especially conservative, this is insufficient for explaining how they end up on the bench. To the extent that Democrats in the Senate have any leverage over Republican nominations, it is the Democratic preference for a more diverse judiciary that enables Republican presidents to appoint women and judges of color who make more conservative decisions. 

\section*{Conclusion}

In this paper, we study how racial and gender diversity in the federal district courts impacts case outcomes. To do so, we analyze an original dataset of around 260,000 civil rights cases decided by 545 district judges over multiple decades in 20 district courts. These districts have jurisdiction over 40\% of the U.S. population and seat 40\% of the federal district judges. We find that among Democratic-appointed district judges, case outcomes are not affected by the identity of the judge assigned; among Republican-appointed district judges, case outcomes are more conservative when cases are assigned to women or judges of color. We confirm these results with a supplemental analysis of the population of cases over a two-year period. Furthermore, we do not find statistical evidence of substantial differences across different subgroups of nontraditional appointees, nor do we find evidence that judges of color or women resolve cases differently when the plaintiff shares their race or gender.

Our approach differs from prior research on the role of judges' identities in that we shy away from attempting to ``isolate'' the effect of a judge's race or gender on their decision-making. Not only is this isolation strategy a difficult causal enterprise, but it also obscures the impact of diversity on the bench. Indeed, our results provide evidence against the common narrative that appointing a more diverse bench leads to more liberal decisions in certain kinds of salient cases, like discrimination cases. It is possible that, had we tried to control for judges' political ideologies, we would not have learned this.

Our analysis allows us to shine a light on the political process around judicial nominations. In a world where Democrats prioritize both the ideologies and identities of nominees, whereas Republicans prioritize only their ideologies, we would expect Democrats to be willing to endure an ideological cost for their pursuit of diversity and Republicans to exploit that fact. Our results are broadly consistent with this logic since Republican presidents appear to appoint more conservative women and judges of color whereas Democrats achieve no ideological gains based on the racial or gender identities of the judges they appoint.

While our analysis provides a methodological and substantive step forward in understanding how diversity on the bench affects case outcomes, there is much left to do. First, and most obviously, future research should examine whether these effects hold for other kinds of cases and in other courts and time periods. A major challenge in looking at other case types is figuring out how to code the directionality of case outcomes in a way that is substantively meaningful and interpretable. The biggest challenge for studying additional courts and time periods is that access to court data (especially data on judges) is currently highly restricted. Second, future research should develop and empirically explore other ways that diversity on the bench matters. Here, we only examine whether cases are resolved differently depending on the identity of the judge assigned. While this clearly speaks to the larger questions about the impact of diversity on the bench, there are other intriguing questions to be answered. For example, does changing the composition of the judiciary influence which plaintiffs file suit, or what kinds of laws are passed in the first place? Does the presence of a larger number of nontraditional appointees on the bench influence how other judges (namely, White men) resolve cases?

\printbibliography

\newpage
\appendix

\begin{refsection}

\setlength\cftparskip{0pt} 

\setlength\parindent{0pt}
\setlength\parskip{12pt}

\setcounter{footnote}{0}

\doublespacing

\setcounter{table}{0}
\setcounter{figure}{0}
\setcounter{page}{1}
\renewcommand{\thetable}{\Alph{section}.\arabic{table}}
\renewcommand{\thefigure}{\Alph{section}.\arabic{figure}}


\setcounter{equation}{0}
\renewcommand{\theequation}{A.\arabic{equation}}

\setcounter{lemma}{0}
\setcounter{proposition}{0}
\setcounter{defn}{0}
\setcounter{assumption}{0}
\renewcommand{\thelemma}{\Alph{section}.\arabic{lemma}}
\renewcommand{\theproposition}{\Alph{section}.\arabic{proposition}}
\renewcommand{\thedefn}{\Alph{section}.\arabic{defn}}
\renewcommand{\theassumption}{\Alph{section}.\arabic{assumption}}
\renewcommand{\theequation}{\Alph{section}.\arabic{equation}}


\singlespacing

\Large 
\noindent\textbf{Online Appendix for} \\
``\mytitle''

\normalsize

\vspace{1em}

\siauthor

\vspace{1em}

\noindent \mydate

\vspace{1em}

\noindent\textit{For Online Publication}

\vspace{1em}

\noindent\textit{Replication code and data will be made available.}

\vspace{1em}

\tableofcontents

\newpage

\section{Our Dataset}\label{app:data-cleaning}
\setcounter{table}{0}
\setcounter{figure}{0}

We constructed a dataset of every civil rights case filed in 20 district courts over the course of multiple decades. We created the dataset using three sources: (1) the FJC's Integrated Database (\url{https://www.fjc.gov/research/idb/}), (2) an original database of docket sheets collected from PACER, and (3) the FJC's Biographical Directory of Federal Judges (\url{https://www.fjc.gov/history/judges}). We merged the first two data sources using each case's docket number. We merged the third data source using judges' names. From these data sources, we coded our main variables of interest.

\vspace{-1em}
\paragraph{Treatment variable}\label{app:first_judge} The treatment variable in our main analysis is a binary variable indicating whether a judge is a ``nontraditional appointee'' or a ``traditional appointee.'' In our dataset, traditional appointees are those whom the FJC's Biographical Directory of Federal Judges classifies as ``White'' in the \textit{Race or Ethnicity} field and ``Male'' in the \textit{Gender} field. Nontraditional appointees are all other judges.

We identified the presiding judge for each case from its docket sheet. At the beginning of each docket sheet, there is an ``Assigned to'' field. We accordingly refer to the judge listed in this field as the ``assigned judge.'' However, closer inspection of the docket sheets revealed that this field is updated whenever a case is reassigned to another judge. As a result, the assigned judge is the \textit{last} judge assigned, not the first judge assigned to a case. Because we do not know why or how some cases are reassigned to different judges, we cannot be confident that the assigned judge in each case is randomly assigned. To get around this issue, we used automated methods to scan the docket sheet entries to identify the first judge to take any action in a case. We are sufficiently confident that this judge, whom we call the ``first judge,'' is the judge to whom the case is randomly assigned when it is filed. We manually coded a random sample of 200 cases and found that our automated method accurately identified the first assigned judge in 95 percent of the cases.

In the left panel of \Cref{fig:first-judges}, we include a screen grab of a portion of the docket sheet corresponding to one of the cases in our dataset. The docket entries (below the jagged line) demonstrate that District Judge Saundra Brown Armstrong was initially assigned to the case. The case was eventually reassigned to District Judge Maxine Chesney, after which the ``Assigned to'' field was updated to reflect the reassignment. In our dataset, Judge Armstrong is coded as the first judge, and Judge Chesney is coded as the assigned judge. Judge Armstrong is therefore the judge we use for our analysis.

\begin{figure}[htb!]
    \centering
    \caption{In the left panel, we provide portions of a screen grab of a docket sheet from a case in our dataset. It shows the kind of information available in federal district court docket sheets, including the identity of the presiding judge. Note that this case was reassigned from one judge to another. In the right panel, we plot the number of cases falling into one of four categories depending on the assigned and first judges.}
    \label{fig:first-judges}
    \begin{tikzpicture}
    \draw[decoration = {zigzag, segment length = 3mm, amplitude = 1mm},decorate, gray, line width = 2pt] (0.01,-1.42in)--(4.1in,-1.42in);
    \draw[line width = 1pt] (0,0in) rectangle (4.1in,-2.9in);
    \node[anchor = north west] at (0,0) {\includegraphics[width=4in]{_img/fgA1-1A.png}};
    \node[anchor = north west] at (0,-1.45in) {\includegraphics[width=4in]{_img/fgA1-1B.png}};

    \node[anchor = west] at (4.15in,-1.65in) {\includegraphics{_img/fgA1-2.pdf}};
    
\end{tikzpicture}
\end{figure}

We use the first judge to code our treatment variable. This means that for some set of cases, we coded a different judge than the one listed in the ``Assigned to'' field. Moreover, for another set of cases, the ``Assigned to'' field is blank even though our automated methods revealed that there is a first judge who was initially assigned to the case.\footnote{Our best guess for why this is the case is that when a judge leaves the bench or otherwise reduces their caseload, some cases may be taken off their docket but never reassigned because they are not currently live cases.}

In the right panel of \Cref{fig:first-judges}, we depict the number of cases where the first and assigned judges are the same, where we used the first judge instead of the assigned judge (because they were different, or the assigned judge was missing), and the number of cases with neither an assigned nor first judge listed.\footnote{These are often cases that are only assigned to a magistrate judge or cases assigned to a ``duty judge'' who hears a large number of smaller cases. Different courts have different rules about these kinds of cases.}

In the main text, we present descriptive statistics on the gender and racial breakdown of the judges in our main dataset. In \Cref{fig:our-judges2}, we present descriptive statistics on the judges in the SCALES dataset, broken down by the party of the appointing president as well as the judges' races and genders. 

\begin{figure}[htb!]
    \centering
    \caption{We plot the number of judges in the SCALES Dataset, broken down by judges' races, genders, and partisanship.}
    \label{fig:our-judges2}
    \includegraphics{_img/fgA2.pdf}
\end{figure}

\vspace{-1em}
\paragraph{Outcome variable} Case outcomes are coded using information from the FJC IDB, as well as from the cases' docket sheets. We primarily rely on the IDB for case outcomes, but prior research suggests that some outcomes in the IDB---and specifically voluntary dismissals---are miscoded \parencite[see][]{hadfield_2004}. A large manual review of the IDB confirmed that many voluntary dismissals were systematically miscoded, as were many judgments for which the IDB did not identify a winning party. We briefly describe the problem and our solution.

The ``voluntary dismissal'' category in the IDB consolidates three substantively different types of outcomes: unilateral plaintiff withdrawals, joint withdrawals filed by both parties, and settlements. A plaintiff can only unilaterally withdraw their case (by way of a voluntary dismissal) before the defendant files an answer or motion for summary judgment (see Federal Rule of Civil Procedure 41(a)(1)(A)(i)). This means that if a plaintiff wishes to withdraw their case after an answer or motion for summary judgment is filed, they must get the defendant(s) to agree. This is especially important because many settlements \textit{also} involve a pro forma notice of joint voluntary dismissal. So, joint voluntary dismissals can indicate either that the plaintiff is withdrawing their case or that the case has been settled.

We use information available in the docket sheets to recode voluntary dismissals to capture this additional nuance. Specifically, for every case that the IDB classifies as a voluntary dismissal:

\vspace{-1.5em}
\begin{itemize}[itemsep=0pt]
    \item If the docket sheet explicitly mentions a settlement occurred, we reclassify the case outcome to ``settlement.''
    \item If the docket sheet does not explicitly mention a settlement occurred, then we reclassify the case outcome to ``joint voluntary dismissal'' if it either (1) mentions a joint voluntary dismissal or (2) was filed after the defendant filed an answer or a motion for summary judgment.
    \item If the docket sheet does not explicitly mention a settlement occurred, nor does it reference a joint voluntary dismissal, then we keep it as a ``voluntary dismissal'' which we assume is unilateral.
\end{itemize}

\vspace{-1em}
Finally, for any case coded as a judgment for an unknown party that explicitly references a settlement, we recoded that case outcome to ``settlement.''
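
The recoding rules for voluntary dismissals listed above can be restated compactly. The notation is introduced only for this summary and is not used elsewhere: for a case $i$ that the IDB classifies as a voluntary dismissal, let $s_i = 1$ if the docket sheet explicitly mentions a settlement, $j_i = 1$ if it mentions a joint voluntary dismissal, and $a_i = 1$ if the dismissal was filed after the defendant filed an answer or a motion for summary judgment (each indicator is zero otherwise). Treating the unilateral voluntary dismissal as the residual category, the recoded outcome is
\[
\text{outcome}_i =
\begin{cases}
\text{settlement} & \text{if } s_i = 1, \\
\text{joint voluntary dismissal} & \text{if } s_i = 0 \text{ and } \max\{j_i, a_i\} = 1, \\
\text{(unilateral) voluntary dismissal} & \text{if } s_i = 0 \text{ and } j_i = a_i = 0.
\end{cases}
\]
The ordering matters: because many settlements also involve a pro forma notice of joint voluntary dismissal, an explicit settlement mention takes precedence over a joint dismissal.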

As one final check, we looked to see whether the IDB's miscoding was correlated with either the party of judges' appointing presidents or whether they are nontraditional appointees. We do so by using aggressive machine learning algorithms to see if these judge characteristics are (partially) predictive of whether case outcomes are miscoded in the IDB. We use the same process as our randomization test described in \Cref{app:claeec}. We plot the ROC curves in \Cref{fig:roc-miscode}, showing that judge characteristics do \textit{not} help predict whether a case outcome is miscoded. This is evidence of classical (random) measurement error in our dependent variable, which will not systematically bias our estimates.
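
A brief sketch illustrates why error of this kind would not bias our estimates. The notation is introduced only for illustration: let $Y_i^*$ denote the true outcome indicator for case $i$, $Y_i$ the coded outcome, $e_i = Y_i - Y_i^*$ the coding error, and $D_i$ the treatment indicator. Within a randomization block,
\begin{align*}
E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]
&= \bigl(E[Y_i^* \mid D_i = 1] - E[Y_i^* \mid D_i = 0]\bigr) \\
&\quad + \bigl(E[e_i \mid D_i = 1] - E[e_i \mid D_i = 0]\bigr).
\end{align*}
If the coding error is unrelated to judge characteristics, so that $E[e_i \mid D_i = 1] = E[e_i \mid D_i = 0]$, then the second term is zero and the difference in coded outcomes equals the difference in true outcomes. The prediction exercise is informative about this condition, with the caveat that it examines whether an outcome is miscoded rather than the direction of the miscoding.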


\begin{figure}[ht]
  \begin{minipage}[c]{3.75in}
  \caption{We used two ensemble machine learning algorithms to predict when case outcomes were miscoded. In the benchmark model, we only use case-level characteristics. In the saturated model, we include case-level characteristics plus assigned judge characteristics (i.e., whether judges were Republican appointees or nontraditional appointees). We plot ROC curves for both models, demonstrating judge characteristics provide no additional predictive power over case characteristics, strong evidence that the IDB's miscoded outcomes are uncorrelated with the assigned judge.}\label{fig:roc-miscode}
  \end{minipage} \hspace{0.25in}
  \begin{minipage}[c]{2.5in}
  \vspace{-1em}
  \includegraphics[width=2.5in]{_img/fgA3.pdf}
  \end{minipage}
\end{figure}

We analyze average treatment effects for the top two outcomes in our dataset:
\begin{itemize}[topsep=0pt,itemsep=0pt]
\item \textit{Settlements} (45\% of our dataset):
\begin{itemize}[topsep=0pt,itemsep=0pt]
    \item All cases in which the IDB's DISP variable takes a value of 13 (case settled).
    \item All cases in which the IDB's DISP variable takes a value of 2 (dismissal for want of prosecution), 3 (dismissal for lack of jurisdiction), 12 (voluntarily dismissed), or 14 (other dismissal), and the docket sheet explicitly mentions a settlement. 
    \item All cases in which the DISP variable takes a value of 4 (default judgment), 5 (consent judgment), 6 (judgment on motion before trial), 7 (judgment after jury verdict), 8 (judgment after directed verdict), 9 (judgment after court trial), 15 (judgment after award of arbitrator), 16 (stayed pending bankruptcy), 17 (other judgment), 18 (statistical closing), 19 (judgment after appeal of magistrate judge affirmed), or 20 (judgment after appeal of magistrate judge denied); the JUDGMENT variable takes a value of 0 (missing), 4 (unknown), or $-8$ (missing); and the docket sheet explicitly mentions a settlement.
\end{itemize}

\item \textit{Defendant wins} (i.e., involuntary dismissals or judgments for defendant, 33\% of our dataset):
\begin{itemize}[topsep=0pt,itemsep=0pt]
    \item All cases in which the DISP variable takes a value of 2  (dismissal for want of prosecution), 3 (dismissal for lack of jurisdiction), or 14 (other dismissal).
    \item All cases in which the DISP variable takes a value of 4 (default judgment), 5 (consent judgment), 6 (judgment on motion before trial), 7 (judgment after jury verdict), 8 (judgment after directed verdict), 9 (judgment after court trial), 15 (judgment after award of arbitrator), 16 (stayed pending bankruptcy), 17 (other judgment), 18 (statistical closing), 19 (judgment after appeal of magistrate judge affirmed) or 20 (judgment after appeal of magistrate judge denied), and the JUDGMENT variable takes a value of 2 (defendant win).
\end{itemize}

\end{itemize}

The remaining outcomes are joint voluntary dismissals (7\%), (unilateral) voluntary dismissals (4\%), remands to state court (3\%), judgments for an unknown party (3\%), judgments for the plaintiff (3\%), inter-district transfers (2\%), and remands to agency (0.1\%).

\vspace{-1em}
\paragraph{Plaintiffs' races and genders} 
In one of our analyses, we investigate whether nontraditional appointees cause different case outcomes based on shared racial or gender identities with the plaintiffs. The FJC does not report plaintiffs' genders or races, so we use automated techniques to predict plaintiffs' race and gender based on their name (extracted from each case's docket sheet) and their county of residence (extracted from the IDB). As some plaintiffs in our sample are government entities, businesses, or other organizations, we employ automated methods to exclude non-human plaintiffs. These methods have undergone thorough validation and utilize custom dictionary approaches that we vetted extensively.

We predicted the gender of each plaintiff using two methods: First, we utilized the \texttt{gender} package by \textcite{blevins_jane_2015} to infer gender based on historical data from the U.S. Social Security Administration. This method used a plaintiff's first name to classify them as ``male'' or ``female'' based on the likelihood a name was associated with a particular gender at a given point in time. For plaintiffs whose first names yielded no clear prediction, we reran the procedure using a plaintiff's middle name, if available. As a second strategy, we classified gender using the Integrated Public Use Microdata Series method in the \texttt{gender} package. The two classifications were in agreement for 91\% of plaintiffs. In our analysis, we use the SSA method and supplement it with the IPUMS method when the SSA method yields no prediction.

To determine the race/ethnicity of plaintiffs in our sample, we used the \texttt{wru} package by \textcite{imai_improving_2016}, which predicts a person's race based on their surname and geolocation (county). This package, used by others like \textcite{grumbach_sahn_2020}, uses the U.S. Census Bureau’s Surname List (2000 version) and geocoded voter registration records to predict the probability that a plaintiff is White, Black, Hispanic, Asian, or Other. When a most-likely race could not be established, we used the \texttt{wru} package again to predict race based on surname only (i.e., without county). We used the resulting probabilities to code a prediction of the most-likely race of each plaintiff. Specifically, we compared the predicted probabilities for each race and coded a plaintiff's race when the probability they were one race (e.g., White) exceeded the combined probabilities of all other races (e.g., non-White). This was a conservative coding decision that ensures our analysis does not include cases with plaintiffs that \texttt{wru} cannot easily classify. As a second strategy, we employed the \texttt{predictrace} package developed by \textcite{predictrace}, which uses first names and surnames to predict the most prevalent race associated with each.
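
The majority rule applied to the \texttt{wru} probabilities can be written compactly. The notation is introduced only for illustration: let $p_{ir}$ denote the predicted probability that plaintiff $i$ belongs to racial category $r$, and assume the five probabilities sum to one. We code plaintiff $i$ as race $r$ when
\[
p_{ir} > \sum_{r' \neq r} p_{ir'} = 1 - p_{ir}
\quad\Longleftrightarrow\quad
p_{ir} > \tfrac{1}{2}.
\]
As a purely hypothetical example, a plaintiff with predicted probabilities of $0.62$ (White), $0.25$ (Black), $0.08$ (Hispanic), $0.03$ (Asian), and $0.02$ (Other) would be coded as White. A plaintiff with probabilities of $0.45$, $0.40$, $0.10$, $0.03$, and $0.02$ would receive no most-likely race, because no single category exceeds one half even though White is the modal category.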

The \texttt{wru} and \texttt{predictrace} packages predict the same race for 58\% of the plaintiffs in our dataset and a different race for 12\% of the plaintiffs in our dataset. For the remaining plaintiffs: 27\% were missing a prediction from \texttt{wru}, 3\% were missing a prediction from \texttt{predictrace}, and 3\% were missing predictions from both. Of the plaintiffs that were coded differently across the two packages, the vast majority were coded as White by \texttt{predictrace} and either Black or Hispanic by \texttt{wru}.

For the \texttt{wru} package (whose predictions we use in the main text), we summarize the distribution of predicted probabilities of plaintiff race in \Cref{tab:app_race_pred}. Specifically, for each plaintiff's racial classification (left column), we present the mean of the predicted probabilities (the five columns on the right). As expected from the prior literature, the algorithm has the hardest time distinguishing Black and White names.

\input{_tab/tabA1}

Because litigant race/ethnicity is not provided to us via case filings, we cannot directly assess the accuracy of the \texttt{wru} and \texttt{predictrace} predictions in our sample. However, we conducted a robustness exercise by comparing our automated prediction method against the races/ethnicities of federal district judges appointed since President Carter, whose races are reported by the FJC. The classifications generated by \texttt{wru} matched the reported race for 78\% of judges in the FJC database. Of the inaccurate predictions, the vast majority were Black judges that \texttt{wru} predicted were most likely White, White judges that it predicted were most likely Black, or White judges to whom we did not assign a most-likely race. The judge race predicted by \texttt{predictrace} matched the reported race for 86\% of judges in the FJC database. The majority of its inaccurate predictions were also Black judges that \texttt{predictrace} coded as White.

Using automated methods, like \cite{imai_improving_2016}'s Bayesian Improved Surname Geocoding (BISG), to predict race/ethnicity based on names is a relatively new endeavor in the literature that has been enabled by the recent development of powerful statistical algorithms and large datasets. Despite their novelty, these methods, notably \texttt{wru}, have gained widespread acceptance in the literature. For instance, \texttt{wru} has been used by Abott and Magazinnik (2020) to identify Latino school board candidates and by \cite{grumbach_sahn_staszak_2022} to predict the race of campaign contributors. For a comprehensive overview of the increasing use of automated methods to predict race in political science and other disciplines, we refer readers to \href{https://static.cambridge.org/content/id/urn:cambridge.org:id:article:S1047198721000310/resource/name/S1047198721000310sup001.pdf}{Clark et al. (2021)}. As for the overall accuracy of these classification methods, a recent research note by Rosenman et al. (2023) found that the \texttt{wru} classifier had just a 13.2\% error rate when applied to a validated sample of 38 million voters. Thus, we rely on \texttt{wru} in the main text, and, as a robustness check on our main findings, \texttt{predictrace} in our online appendix (see \Cref{fig:fgC2}).

\vspace{-1em}
\paragraph{Types of civil rights cases} 

To gain insight into the types of civil rights involved in our case sample, we randomly sampled 100 original complaints in our data, read the full complaints, and categorized them by the type of civil rights discussed in each complaint. The distribution of our sample can be found in Table \ref{tab:civil_rights_case_type}. 

\input{_tab/tabA2}

\section{Leveraging the Random Assignment of Cases to Judges}\label{app:claeec}
\setcounter{table}{0}
\setcounter{figure}{0}

We rely on the random assignment of cases to judges in federal district courts to estimate causal effects. However, ensuring that we properly leverage random assignment requires some additional work, which we outline here.

\vspace{-1em}
\paragraph{Collect qualitative information about case assignment procedures} 

We collected qualitative information from each court's General Orders and Local Rules relating to the processes used to assign cases to the judges in our sample. Based on our review of this information, we concluded that it is standard practice for cases to be randomly assigned to judges as they are filed and that this random assignment typically occurs within the division of the district in which the case was filed.\footnote{For example, the Northern District of California has four divisions (sometimes called offices, duty stations, or courthouses), based in Eureka, Oakland, San Francisco, and San José. A plaintiff can file their case in any of these divisions (see page 3 of \url{https://cand.uscourts.gov/filelibrary/1243/Atty_Case_Opening_Guide_2019.pdf}).}

Practically speaking, within a division and unit of time, we can consider all cases filed to be randomly allocated to the available district judges.\footnote{Semi-retired ``senior judges'' have more latitude over how many cases they are assigned but not \textit{which} cases they are assigned. We find no evidence that cases are assigned non-randomly to these judges, so we include them in our analysis. For substantive reasons, we drop cases heard by judges from another court who are temporarily assigned to hear cases in a particular court.} (Again, it is insufficient to assume random case assignment within each district.) We use the filing year as our unit of time. We refer to each district-division-year combination as a ``randomization block.''\footnote{Ideally we would use a finer unit of time, such as quarter, month, or week. However, the finer our measure, the more difficult it is for us to estimate effects since it will cause us to dramatically reduce our within-block sample size. Moreover, as we demonstrate below, cases appear to be as-if randomly assigned within each district-division-year.} In the main text, we discuss how we recover the random assignment by way of our statistical estimation model. 

\vspace{-1em}
\paragraph{Address problems with judge assignment in some cases} Each case in our dataset must have exactly one presiding district judge. We first dropped all cases with no presiding district judge. Then we took the following additional steps:

Step 1: We dropped any case that we have reason to believe was not randomly assigned to a judge via the normal randomized procedure. This included: (1) cases that are classified as multi-district litigation cases; (2) cases that appear to be one of several ``related cases,'' which are assigned to a specific judge because they were filed on the same day, in the same district-division, with the same nature of suit code; or (3) cases that were either post-appeal actions or appeals of a magistrate judge's decision.\footnote{Note that each case can have multiple observations in the FJC's IDB since a case can be terminated and reopened multiple times. For each case (identified by a docket number), we only take one of the entries in the FJC's IDB: the last one that occurs before post-appeal actions. Our rationale for this is that the initially assigned judge may cause the case to be reopened repeatedly if, for example, that judge is prone to dismiss cases for minor defects.}

Step 2: We dropped any case that was heard by a judge who does not sit in the filing court (a ``visiting judge'') or any judge appointed by a president before Jimmy Carter or after Barack Obama. The justification for the first of these is that we do not want our effects to be influenced by judges who do not regularly sit in the district and who may only hear a limited number of cases. The justification for the second of these is that these judges do not hear a large number of cases. Neither of these decisions creates a problem for our causal identification strategy since the appointing president is a pre-treatment variable.

Step 3: We dropped all cases heard within a court-division-year block in which fewer than five cases were heard by nontraditional appointees or fewer than five cases were heard by White men. We did this following the recommendations in \textcite{green_lab_2016}. This is a consequential data-cleaning step, as it forces us to drop a sizeable number of cases that are (in principle) randomly assigned to a judge. We also estimated our main effects with alternative minimum thresholds of one and ten cases, and our results do not change.

\vspace{-1em}
\paragraph{Provide quantitative evidence in support of our approach} 

In our analysis, we assume cases are as-if randomly assigned conditional on court-division and case filing year. We conduct an aggressive test of this assumption using a computational approach described in \textcite{hubert_copus_jop}. First, we use a machine learning algorithm to predict unit-level treatment status using district-division-year randomization blocks, since our identifying assumption is that potential outcomes are independent of treatment only after conditioning on these randomization blocks. We call this our ``benchmark model'' and, as expected, we find that these randomization blocks are predictive of treatment. Second, we use the same aggressive algorithm to predict unit-level treatment status using district-division-year randomization blocks plus a collection of additional pre-treatment variables. We call this our ``saturated model.''\footnote{Essentially, we estimate a propensity score model using a cross-validated machine learning algorithm that more aggressively targets accurate predictions than the commonly used logistic regression.} If cases are truly randomly assigned to judges, then these additional pre-treatment variables should not provide any additional predictive power above and beyond the district-division-year randomization blocks.
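
The logic of this test can be stated compactly. The notation below is ours and purely illustrative: let $T_i \in \{0,1\}$ denote the treatment status of case $i$, $B_i$ its district-division-year randomization block, and $Z_i$ the vector of additional pre-treatment variables. The benchmark and saturated models target, respectively,
\begin{align*}
    e^{\text{bench}}(B_i) = \Pr(T_i = 1 \mid B_i)
    \quad \text{and} \quad
    e^{\text{sat}}(B_i, Z_i) = \Pr(T_i = 1 \mid B_i, Z_i).
\end{align*}
If assignment is as-if random within blocks, so that $T_i$ is independent of $Z_i$ conditional on $B_i$, then
\begin{align*}
    \Pr(T_i = 1 \mid B_i, Z_i) = \Pr(T_i = 1 \mid B_i),
\end{align*}
and the two models should produce the same propensity scores up to estimation error. A saturated model that predicts treatment noticeably better than the benchmark model is therefore evidence against within-block random assignment.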

Our initial tests revealed an imbalance in the cases heard by Republican appointees, suggesting the possibility of non-random assignment. Further inspection revealed that the imbalance was due to a pattern of case assignments to one judge. As the left panel of \Cref{fig:preska1} illustrates, upon becoming chief judge for the Southern District of New York (NYSD) in 2009, Judge Loretta Preska (a Republican appointee) began hearing a much larger number of civil rights cases. The center panel strongly suggests that these additional cases were not randomly assigned; the rate at which she granted judgment for defendants also increased precipitously and suddenly. These may have been especially strong cases for defendants. The right panel shows that, when applied only to NYSD cases, our test for imbalance detected severe violations of randomized case assignment (visualized with standard ROC curves for the benchmark and saturated models). We thus dropped from our dataset all cases heard by Judge Preska during her tenure as chief judge. 

\begin{figure}[htb!]
    \centering
    \caption{
    The left panel plots the yearly number of cases assigned to each judge in the Southern District of New York (NYSD). Black dots are for Judge Loretta Preska, and gray dots are for all other judges in the court. Triangles indicate which judge is chief judge. The time period when Judge Preska served as chief judge is shaded gray. The middle panel is similar, except it depicts the number of judgments for the defendant issued by each judge in NYSD. Finally, the right panel plots ROC curves for the randomization test described in the main text, when we include cases heard by Judge Preska.
    }\label{fig:preska1}
    \includegraphics{_img/fgB1.pdf}
\end{figure}

After dropping these cases, we again performed our machine learning randomization test for cases assigned to Democratic appointees and cases assigned to Republican appointees. We do this for both our main dataset and the SCALES dataset. In all tests, the saturated model does not provide additional predictive power over the benchmark model, supporting the assumption of randomized case assignment. We illustrate this with standard ROC curves plotted in the left half of \Cref{fig:randomization}. In the right half, we present this information in a slightly different way, with eQQ plots comparing the distribution of propensity scores from the benchmark and saturated models. Note that the distributions are nearly identical, again supporting the assumption of randomized case assignment. 

All analyses reported in the paper are conducted without cases assigned to Judge Preska during the period she was chief judge. Results do not change if we drop cases heard by all chief judges or drop all cases filed in NYSD during Judge Preska's term as chief judge.

\begin{figure}[htb!]
    \centering
    \caption{We plot the results of our case randomization tests for the main dataset (Panel A) and the SCALES dataset (Panel B). The left plots show ROC curves for the benchmark (gray lines) and saturated models (black lines) described in the main text. (Note: the gray lines are almost completely covered by the black ones.) The right panels are eQQ plots comparing the distributions of propensity scores from the benchmark and saturated models. 
    }\label{fig:randomization}
    \includegraphics{_img/fgB2.pdf}
    
\end{figure}

\clearpage
\section{Additional Analyses and Robustness Checks}\label{app:robustness}
\setcounter{table}{0}
\setcounter{figure}{0}

\begin{figure}[ht]
    \centering
    \caption{Each point plots an average treatment effect on defendant wins, along with a 95\% confidence interval using judge-clustered standard errors (the smaller bars present adjustments for multiple hypothesis testing using the Bonferroni method, with the number of independent tests estimated). Each estimate shows the estimated effect of assigning cases to judges with specific racial and/or gender characteristics, relative to traditional appointees. For each set of estimates, we present the number of cases in our analysis (top number) and the number of treatment/control judges (bottom number). Note: as in all our analyses, we only provide estimates if there are at least 20 judges in the treatment group. Full results for this plot are available in \Cref{tab:fgC1} in Online Appendix E.}\label{fig:fgC1}
    \includegraphics{_img/fgC1.pdf}
\end{figure}

\begin{figure}[ht]
    \centering
    \caption{Purple circles display average treatment effects of assigning a case to a subgroup of nontraditional appointees (relative to traditional appointees) in the set of cases brought by plaintiffs who share the identity of the treatment judges. Green squares display effects in the set of cases brought by plaintiffs who do \textit{not} share the identity of the treatment judges. Darker purple/green indicates we used the \texttt{wru} package to code plaintiff race and lighter purple/green indicates we used the \texttt{predictrace} package. The 95\% confidence intervals are not adjusted for multiple hypothesis testing. Full results for this plot are available in \Cref{tab:fgC2} in Online Appendix E.}\label{fig:fgC2}
    \includegraphics{_img/fgC2.pdf}
\end{figure}

\begin{figure}[ht]
    \centering
    \caption{Average treatment effects for assignment to nontraditional appointees, with and without controls for appointing president, along with 95\% confidence intervals. Full results for this plot are available in \Cref{tab:fgC3} in Online Appendix E.}\label{fig:fgC3}
    \includegraphics{_img/fgC3.pdf}
\end{figure}

\FloatBarrier % Prevents floats from moving past this point

\section{Formal Model of Trading Diversity}\label{app:model}
\setcounter{table}{0}
\setcounter{figure}{0}

We analyze a game that resembles a classic agenda-setting model, most prominently articulated by \textcite{romer_rosenthal_1978}. Our model presupposes a judicial vacancy and features two players, $D$ and $R$, whom we index by $i$. The game begins with one player being chosen to be the president ($P$); the other is the opposition party in the Senate ($O$). The president proposes a nominee, and the Senate must decide whether to approve the nominee. We abstract away from internal Senate decision-making and simply assume that the key decision-maker is the opposition party.

\vspace{-1em}
\paragraph{Sequence} The game proceeds as follows:

\vspace{-1.5em}
\begin{enumerate}[itemsep=0pt]
    \item Nature chooses $P \in \{D, R\}$ and $b_P$, which are publicly revealed.
    \item $P \in \{D,R\}$ chooses nominee ideology $x \in \mathbb{R}$ and whether they are from an underrepresented group, $d \in \{0,1\}$.
    \item Nature reveals whether the nominee is qualified, $q \in \{0, 1\}$, where $\Pr(q = 0) \equiv \nu < 1/2$.
    \item $O$ decides whether to support or oppose the nominee, $s \in \{0,1\}$.
    \item If $O$ supports, the nominee is appointed; otherwise, the players receive default payoffs.\footnote{These could be payoffs corresponding to the nominee being rejected, or other kinds of political costs.} 
    \item Payoffs are realized.
\end{enumerate}

We allow Nature to choose who is president and the benefit that the president gets from diversity. This uncertainty plays no role in the players' strategic calculations as the information is publicly revealed. We include this step so that we can talk more clearly about the ``likelihood'' that the president nominates a nontraditional appointee. For completeness, we assume $P$ is drawn from a Bernoulli distribution with $\Pr(P = R) = \rho$ and $\Pr(P = D) = 1 - \rho$, and that $b_P$ is drawn from some distribution with strictly positive mass everywhere on $\mathbb{R}$ and with cdf $F_P$ that depends on the party of the president.

\vspace{-1em}
\paragraph{Nominees} There is a pool of available nominees for each party: $X \times D = \mathbb{R} \times \{0,1\}$. A nominee is a pair, $(x,d)$, indicating their ideology and whether they are a nontraditional appointee. 

We allow for the possibility that the nominee is discovered to be unqualified during the Senate's review of the nominee's qualifications. Formally, after the president announces the nomination, Nature reveals whether the nominee is qualified, $q \in \{0,1\}$, where $\Pr(q = 0) = \nu < 1/2$. We assume that the nominee is more likely to be qualified than not qualified. This extra step in the game does not fundamentally alter the players' strategic calculations, but it does ensure that there will be rejections in equilibrium. 

\paragraph{Players and payoffs}
The payoff function for a player $i \in \{D, R\}$ is:
\begin{align*}
u_i = 
\begin{dcases}
    (b_i - c_i p_i) d - (x - \hat{x}_i)^2 - (1-q)\kappa_i & \text{if nominee is accepted}\\
    \overline{u}_i & \text{if nominee is rejected}
\end{dcases}
\end{align*}
where:

\vspace{-1.5em}

\begin{itemize}[itemsep=0pt]
    \item $p_i \in \{0,1\}$ is an indicator variable for whether $i$ is the president/proposer.
    \item $x \in X = \mathbb{R}$ is the ideology of the nominee, and $d \in D = \{0,1\}$ indicates whether the nominee is considered a ``nontraditional'' nominee (see discussion of this terminology in the main text).
    \item $b_i \in \mathbb{R}$ is the benefit that $i$ gets from a nontraditional appointee relative to a traditional appointee. Note: this allows for the case where $b_i < 0$, indicating a preference for traditional appointees.
    \item $c_i \geq 0$ is the ``search cost'' for nominating a nontraditional nominee. Notes: (1) this is only paid if $i$ is the proposer, and (2) this can vary by party.
    \item $\overline{u}_i \in \mathbb{R}$ is $i$'s default payoff from a nominee being opposed by the opposition party. We interpret this parameter as a measure of the ``strength'' of the opposition party.
    \item $\kappa_i > 0$ is the cost associated with an ``unqualified'' nominee being appointed. 
\end{itemize}
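
To make the role of $p_i$ concrete, it may help to write out the payoffs from an accepted nominee separately for the two roles. Setting $p_P = 1$ for the president and $p_O = 0$ for the opposition in the payoff function above gives
\begin{align*}
    u_P &= (b_P - c_P)d - (x - \hat{x}_P)^2 - (1-q)\kappa_P,\\
    u_O &= b_O d - (x - \hat{x}_O)^2 - (1-q)\kappa_O,
\end{align*}
so the search cost enters only the president's payoff, and only when $d = 1$.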

We make the following scope assumptions on the payoffs.

\begin{assumption}\label{ass:opp-reject1} Each player strictly prefers a nominee that is at their own ideal point (regardless of diversity concerns). Formally, $\min\{0, b_i\} > \overline{u}_i$.
\end{assumption}

\begin{assumption}\label{ass:opp-reject2} Each player strictly prefers that a nomination fail if the ideology of the nominee is at the other player's ideal point. Formally, $\max\{0, b_i\} - (\hat{x}_j - \hat{x}_i)^2 < \overline{u}_i$.
\end{assumption}

\begin{assumption}\label{ass:opp-reject3} Each player strictly prefers that a nomination fail if the nominee is revealed to be an unqualified candidate. Formally, $\max\{0, b_i\} - \kappa_i < \overline{u}_i$.
\end{assumption}

\Cref{ass:opp-reject1} ensures that each party strictly prefers a qualified nominee at its own ideal point to a failed nomination. \Cref{ass:opp-reject2} allows us to rule out corner solutions in which a president can nominate a candidate at her own ideal point. This ensures that the president faces a genuine ideological trade-off when she makes a nomination. \Cref{ass:opp-reject3} ensures that the opposition always rejects a candidate who is revealed to be unqualified.

We will characterize subgame perfect equilibria, which we find using backward induction. As is standard, we will denote an equilibrium strategy with a star and a generic strategy without a star.

\vspace{-1em}
\paragraph{Senate's strategy} $O$ supports a nominee $(x,d)$ if and only if
\begin{align*}
    b_O d - (x - \hat{x}_O)^2 - (1-q)\kappa_O
    \geq 
    \overline{u}_O
\end{align*}
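The $\Rightarrow$ direction is immediate: if the inequality fails, $O$ strictly prefers to oppose. The $\Leftarrow$ direction is immediate when the inequality is strict. When it holds with equality, $O$ is indifferent; however, as is standard in agenda-setting models, $O$ must support in any equilibrium, since if $O$ rejects when indifferent, $P$ has no maximizer.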

Immediately, by \Cref{ass:opp-reject3}, $s^* = 0$ if $q = 0$. 

Next, consider the case in which $q = 1$. Let $\tilde{x}(d)$ indicate the  most ideologically congruent nominee $P$ can get (as a function of $d$) while satisfying $O$'s constraint, which is implicitly defined by:
\begin{align}\label{eq:accept}
    b_O d - (\tilde{x}(d) - \hat{x}_O)^2
    =
    \overline{u}_O
\end{align}
Since $b_O d > \overline{u}_O$ (by \Cref{ass:opp-reject1}),\footnote{Note: if this condition were to fail, then (\ref{eq:accept}) has no solution, and $O$ cannot be induced to accept the nominee.} we can solve for the $x$ that induces acceptance:
\begin{align*}
    \tilde{x}(d) = 
    \begin{dcases}
    \hat{x}_O + \sqrt{b_O d - \overline{u}_O} &\text{if }\hat{x}_O < \hat{x}_P\\
    \hat{x}_O - \sqrt{b_O d - \overline{u}_O} &\text{if }\hat{x}_O > \hat{x}_P
    \end{dcases}
\end{align*}
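
For completeness, the intermediate steps are as follows. Rearranging (\ref{eq:accept}) gives
\begin{align*}
    (\tilde{x}(d) - \hat{x}_O)^2 = b_O d - \overline{u}_O
    \quad \Longrightarrow \quad
    \tilde{x}(d) = \hat{x}_O \pm \sqrt{b_O d - \overline{u}_O},
\end{align*}
and $P$ selects the root on her own side of $\hat{x}_O$. Moreover, \Cref{ass:opp-reject2} applied to $O$ implies $b_O d - \overline{u}_O \leq \max\{0, b_O\} - \overline{u}_O < (\hat{x}_P - \hat{x}_O)^2$, so $\tilde{x}(d)$ lies strictly between $\hat{x}_O$ and $\hat{x}_P$. As a purely illustrative numerical case (the values are hypothetical, chosen for convenience, and satisfy \Cref{ass:opp-reject1} and \Cref{ass:opp-reject2} for $O$), let $\hat{x}_O = 0$, $\hat{x}_P = 2$, $\overline{u}_O = -1$, and $b_O = 0.44$. Then
\begin{align*}
    \tilde{x}(0) = 0 + \sqrt{0 - (-1)} = 1
    \quad \text{and} \quad
    \tilde{x}(1) = 0 + \sqrt{0.44 - (-1)} = 1.2.
\end{align*}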

\begin{lemma} Given \Cref{ass:opp-reject1} and \Cref{ass:opp-reject3}, $s^*(x,d,q) = 1$ if and only if $q = 1$ and $|x - \hat{x}_O| \leq |\tilde{x}(d) - \hat{x}_O|$. 
\end{lemma}

\begin{proof} In the preceding text.
\end{proof}

\vspace{-1em}
\paragraph{President's strategy} 

First, assume that $P$ satisfies $O$'s constraint. 

Since $O$ never supports an unqualified nominee ($q = 0$), $P$ faces uncertainty over whether her nominee will be confirmed. $P$'s ex ante expected payoff from nominating $(x,d)$ that satisfies $O$'s constraint is
\begin{align*}
	(1-\nu)[(b_P - c_P)d - (x - \hat{x}_P)^2] + \nu \overline{u}_P
\end{align*}
By \Cref{ass:opp-reject2}, $O$ always rejects a nominee with $x = \hat{x}_P$, so $P$ will seek to get the most ideologically congruent judge she can possibly get. 
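
The expected payoff above follows from weighting the two possible realizations of $q$. With probability $1 - \nu$ the nominee is qualified, $O$ supports, and $P$ receives the accepted-nominee payoff with $q = 1$; with probability $\nu$ the nominee is unqualified, $O$ opposes, and $P$ receives $\overline{u}_P$:
\begin{align*}
    \mathbb{E}[u_P \mid x, d]
    = \underbrace{(1-\nu)}_{\Pr(q = 1)}\bigl[(b_P - c_P)d - (x - \hat{x}_P)^2\bigr]
    + \underbrace{\nu}_{\Pr(q = 0)}\,\overline{u}_P.
\end{align*}
Because $1 - \nu > 0$ and neither $\nu$ nor $\overline{u}_P$ depends on $(x,d)$, maximizing this expression over nominees that satisfy $O$'s constraint is equivalent to maximizing the bracketed term.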

We make the following assumption to simplify the exposition by reducing the number of substantively trivial cases we need to consider. Note that this only has bite for a knife-edge region of the parameter space. 

\begin{assumption}\label{ass:indifferent} When indifferent, the president chooses $d = 1$.
\end{assumption}

Then, given \Cref{ass:indifferent}, $P$ nominates a nontraditional nominee if and only if:
\begin{align*}
	b_P - c_P - (\tilde{x}(1) - \hat{x}_P)^2
    \geq 
    - (\tilde{x}(0) - \hat{x}_P)^2
\end{align*}
(Note: the weakness of this inequality comes from \Cref{ass:indifferent}.)   
Whether this condition holds depends on the relative weight $P$ places on ideology and diversity. If $b_P > c_P$, then diversity directly increases $P$'s payoff. On the other hand, if $b_P < c_P$, then diversity directly lowers $P$'s payoffs. Rearranging yields:
\begin{align}\label{eq:diverse}
    b_P
    \geq 
    c_P + 
    (\tilde{x}(1) - \hat{x}_P)^2 - (\tilde{x}(0) - \hat{x}_P)^2
   	\equiv \hat{b}_P
\end{align}
Let $\hat{b}_P$ be the value of $b_P$ at which the condition binds, so that $P$ nominates a nontraditional appointee if and only if $b_P \geq \hat{b}_P$. Note that $\hat{b}_P$ can be positive or negative. If $\hat{b}_P > 0$, a president who values diversity ($b_P > 0$) may still not be induced to nominate a nontraditional nominee; if $\hat{b}_P < 0$, even a president with $b_P < 0$ may nominate one. When does each case arise?

\textbf{Case 1:} Suppose $b_O > 0$. Then, it is straightforward to see that $P$ can get a more ideologically congruent appointment by nominating a nontraditional nominee: $|\tilde{x}(1) - \hat{x}_P| < |\tilde{x}(0) - \hat{x}_P|$. Then, the ideological term on the right-hand side of (\ref{eq:diverse}), $(\tilde{x}(1) - \hat{x}_P)^2 - (\tilde{x}(0) - \hat{x}_P)^2$, is strictly negative, so $\hat{b}_P < c_P$. 

\textbf{Case 1A:} Suppose that $b_P - c_P > 0$. Then, $P$ will always nominate a nontraditional nominee, even if the nontraditional nominee is only negligibly more ideologically congruent (i.e., if $\tilde{x}(0) \approx \tilde{x}(1)$). 

\textbf{Case 1B:} Suppose that $b_P - c_P < 0$. Then, $P$ will nominate a nontraditional nominee if and only if the ideological gain from doing so is large enough to offset the net cost $c_P - b_P$. Moreover, as $b_P - c_P$ declines, $P$ requires a larger ideological gain in order to be willing to nominate a nontraditional nominee. 

\textbf{Case 2:} Suppose $b_O < 0$. Then, it is straightforward to see that $P$ can get a more ideologically congruent appointment by nominating a traditional nominee: $|\tilde{x}(1) - \hat{x}_P| > |\tilde{x}(0) - \hat{x}_P|$. This is because the Senate is biased against nontraditional nominees. Then, the right-hand side of (\ref{eq:diverse}) is strictly positive. So, at a minimum, $P$ must value diversity in order for the condition to hold. Moreover, to overcome $O$'s opposition (i.e., the relatively larger gap between $\tilde{x}(0)$ and $\tilde{x}(1)$), she must \textit{highly} value diversity in order to be willing to nominate an appointee representing a nontraditional group. 

\textbf{Case 3:} Suppose $b_O = 0$. Then, it is straightforward to see that $P$ gets an equally ideologically congruent appointment from a nontraditional or a traditional nominee: $|\tilde{x}(1) - \hat{x}_P| = |\tilde{x}(0) - \hat{x}_P|$. In this case, the Senate gets no positive or negative payoff from diversity. Then, the ideological term on the right-hand side of (\ref{eq:diverse}) is zero, so $\hat{b}_P = c_P$, and $P$ nominates a nontraditional appointee if and only if $b_P \geq c_P$.
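
As a numerical illustration of Case 1 (all values are hypothetical and chosen only for convenience), let $\hat{x}_O = 0$, $\hat{x}_P = 2$, $\overline{u}_O = -1$, $b_O = 0.44$, and $c_P = 0.2$. Solving (\ref{eq:accept}) gives $\tilde{x}(0) = 1$ and $\tilde{x}(1) = 1.2$, so the threshold in (\ref{eq:diverse}) is
\begin{align*}
    \hat{b}_P = 0.2 + (1.2 - 2)^2 - (1 - 2)^2 = 0.2 + 0.64 - 1 = -0.16.
\end{align*}
A president with $b_P = -0.1$, who mildly prefers traditional appointees and also bears the search cost, therefore still nominates a nontraditional appointee, because doing so moves the confirmable nominee from $1$ to $1.2$, closer to her ideal point of $2$.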

Recall the above analysis proceeded by supposing that $P$ satisfies $O$'s acceptance constraint. We now characterize the conditions under which this occurs. First note that if $P$ satisfies $O$'s constraint, she will select either $(\tilde{x}(1), 1)$ or $(\tilde{x}(0), 0)$. Let:
\begin{align*}
    \tilde{U}_P(d) = \begin{dcases}
    - (\tilde{x}(0) - \hat{x}_P)^2& \text{if }d = 0\\
    (b_P - c_P) - (\tilde{x}(1) - \hat{x}_P)^2& \text{if }d = 1
    \end{dcases}
\end{align*}
Let $\overline{u}_P^{\text{accept}}$ be defined by:
\begin{align*}
    \overline{u}_P^{\text{accept}} \equiv \min\{\tilde{U}_P(0), \tilde{U}_P(1)\}
\end{align*}
Then, it is weakly optimal for $P$ to satisfy $O$'s constraint if $\overline{u}_P^{\text{accept}} \geq \overline{u}_P$, and strictly optimal if the condition holds strictly. In the spirit of classical bargaining models, we opt to focus on cases in which rejection is the worst outcome for both players. So, we make this additional assumption:

\begin{assumption}\label{ass:pres-reject} $ \overline{u}_P^{\text{accept}} > \overline{u}_P$.    
\end{assumption}

We can now characterize a unique equilibrium of the model.

\begin{proposition}\label{prop:equilibrium} Given Assumptions \ref{ass:opp-reject1}--\ref{ass:pres-reject}, there is a unique subgame perfect equilibrium of the game that is characterized as follows:

\vspace{-1em}
\begin{itemize}
    \item $O$ supports the nominee ($s^* = 1$) if and only if $q = 1$ and $|x - \hat{x}_O| \leq |\tilde{x}(d) - \hat{x}_O|$.
    \item $P$ proposes a nominee $(x^*, d^*)$ such that $x^* = \tilde{x}(d^*)$ and $d^* = 1$ if and only if $b_P \geq \hat{b}_P$.
\end{itemize}
\end{proposition}

\begin{proof}[Proof of \Cref{prop:equilibrium}] In the preceding text.
\end{proof}

Note that if \Cref{ass:pres-reject} fails, then it is possible (although not guaranteed) to get rejection on the equilibrium path if there is no nominee $P$ could choose to make her better off than her default payoff. Clearly, this would be a substantively strange situation, as it implies the president would be better off with her nominees being opposed/rejected.

\vspace{-1em}
\paragraph{Empirical Implications} In the remainder, we characterize several empirical implications of our model, in addition to the one in the main text.

\textbf{Proposition 1.} \textit{In the main text.}

\begin{proof}[Proof of \Cref{prop:trading-diversity}] Follows directly from comparing (\ref{eq:accept}) when setting $d = 1$ and when $d = 0$.
\end{proof}

Let $\delta^*$ be the ex ante equilibrium probability that $P$ nominates a nontraditional nominee: $\delta^* \equiv \Pr(b_P \geq \hat{b}_P) = 1-F_P(\hat{b}_P)$, where the last equality holds provided $F_P$ has no atom at $\hat{b}_P$. 

\begin{proposition}\label{prop:nominating-costs} As the cost of nominating a nontraditional appointee ($c_P$) increases, the president is less likely to appoint a nontraditional appointee ($\delta^*$ decreases).
\end{proposition}

\begin{proof}[Proof of \Cref{prop:nominating-costs}] First, note that $\hat{b}_P$ (defined in equation \ref{eq:diverse}) increases one-for-one with $c_P$, since $\tilde{x}(0)$ and $\tilde{x}(1)$ do not depend on $c_P$. Since $F_P$ is the cdf of a distribution with positive mass everywhere on $\mathbb{R}$, it is strictly increasing in its argument. Then, $\delta^* = 1 - F_P(\hat{b}_P)$ is decreasing in $\hat{b}_P$ and hence in $c_P$.
\end{proof}
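
Under the additional assumption that $F_P$ is differentiable with density $f_P > 0$ (an assumption made here only for illustration; the proof above does not need it), the comparative static can also be written as a derivative. Since $\tilde{x}(0)$ and $\tilde{x}(1)$ do not depend on $c_P$, (\ref{eq:diverse}) gives $\partial \hat{b}_P / \partial c_P = 1$, and so
\begin{align*}
    \frac{\partial \delta^*}{\partial c_P}
    = -f_P(\hat{b}_P)\,\frac{\partial \hat{b}_P}{\partial c_P}
    = -f_P(\hat{b}_P) < 0.
\end{align*}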

\section{Regression Tables}\label{app:regression_results}
\setcounter{table}{0}
\setcounter{figure}{0}

On the last pages of this appendix, we present regression tables corresponding to the average treatment effects we plot in the main text and the preceding sections of the Online Appendix.

\printbibliography[heading=subbibliography]

\newpage 
\input{_tab/fg4}

\input{_tab/fg5}

\newpage
\input{_tab/fg6}

\newpage
\input{_tab/fg7}

\input{_tab/fgC3}

\newpage
\input{_tab/fgC1}

\newpage
\input{_tab/fgC2}

\end{refsection}

\end{document}